Nucleotide analogs

ABSTRACT

Provided herein is technology relating to the manipulation and detection of nucleic acids, including but not limited to compositions, methods, and kits related to nucleotides comprising a chemically reactive linking moiety.

FIELD OF INVENTION

Provided herein is technology relating to the manipulation and detection of nucleic acids, including but not limited to compositions, methods, and kits related to nucleotides comprising a chemically reactive linking moiety.

BACKGROUND

Nucleic acid detection methodologies continue to serve as a critical tool in the field of molecular diagnostics. The ability to manipulate biomolecules specifically and efficiently provides the basis for many successful detection technologies. For example, linking a chemical, biological, or physical moiety (e.g., adding a “tag”) to a biomolecule of interest is one key technology related to the subsequent manipulation, detection, and/or identification of the biomolecule.

Conventional linking technologies often rely on enzyme-assisted methods. For example, some methods to append a desired tag onto a target DNA use a ligase enzyme to join the target DNA to the tag (e.g., another DNA fragment comprising the tag, another DNA fragment to serve as the tag itself, etc.). In another method, a polymerase enzyme incorporates a tag-modified substrate of the polymerase (e.g., a dNTP or a modified-dNTP) into a nucleic acid. An advantage of these enzyme-assisted methods is that the links joining the biomolecule to the moiety are “natural” linkages that allow further manipulation of the conjugated product. However, some important drawbacks include low product yields, inefficient reactions, and low specificity due to multiple reactive groups present on a target biomolecule that the enzyme can recognize. In addition, conventional methods have high costs in both time and money.

SUMMARY

Accordingly, provided herein is technology related to linking moieties to biomolecules using chemical conjugation. These linkage reactions are more specific and efficient than conventional technologies because the reactions are designed to include a mechanism of conjugation between specific chemical moieties.

While most conventional chemical covalent linkages are not recognized and/or processed by biological catalysts (e.g., enzymes), thus limiting subsequent manipulation of the conjugated product, the technology described herein provides a chemical linkage that allows downstream manipulation of the conjugated product by standard molecular biological and biochemical techniques.

For example, while there are many nucleotide analogs currently available that can terminate a polymerase reaction (e.g., dideoxynucleotides and various 3′ modified nucleotide analogs), these molecules inhibit or severely limit further manipulation of nucleic acids terminated by these analogs. For example, subsequent enzymatic reactions such as the polymerase chain reaction are completely or substantially inhibited by the nucleotide analogs. In addition, some solutions have utilized nucleotide analogs called “reversible terminators” in which the 3′ hydroxyl groups are capped with a chemical moiety that can be removed with a specific chemical reaction, thus regenerating a free 3′ hydroxyl. Use of these nucleotide analogs, however, requires the additional deprotection (uncapping) step to remove the protecting (capping) moiety from the nucleic acid as well as an additional purification step to remove the released protecting (capping) moiety from the reaction mixture.

In contrast to conventional technologies, provided herein is technology related to the design, synthesis, and use of nucleotide (e.g., ribonucleotide, deoxyribonucleotide) analogs that comprise chemically reactive groups. For example, some embodiments provide a nucleotide analog comprising an alkyne group, e.g., a nucleotide comprising a 3′ alkyne group such as provided in embodiments of the technology related to 3′-O-propargyl deoxynucleotides. The chemical groups and linkages do not impair or significantly limit the use of subsequent molecular biological techniques to manipulate compounds (e.g., nucleic acids, conjugates, and other biomolecules) comprising the nucleotide analogs. As such, the compounds (e.g., nucleic acids, conjugates, and other biomolecules) comprising the described nucleotide analogs are useful for many applications.

In some embodiments, nucleotide analogs find use as functional nucleotide terminators, that is, the nucleotide analogs terminate synthesis of a nucleic acid by a polymerase and additionally comprise a functional reactive group for subsequent chemical and/or biochemical processing, reaction, and/or manipulation. In particular, some embodiments provide a nucleotide analog in which the 3′ hydroxyl group is capped by a chemical moiety comprising, e.g., an alkyne (e.g., a carbon-carbon triple bond, e.g., C≡C). When the 3′ alkyne nucleotide analog is incorporated into a nucleic acid by a polymerase (e.g., a DNA and/or RNA polymerase) during synthesis, further elongation of the nucleic acid is halted (“terminated”) because the nucleic acid does not have a free 3′ hydroxyl to provide the proper substrate for subsequent nucleotide addition.

While the nucleotide analogs are not a natural substrate for conventional molecular biological enzymes, the alkyne chemical moiety is a well-known chemical conjugation partner reactive with particular functional moieties. For example, an alkyne reacts with an azide group (e.g., N₃, e.g., N═N═N) in a copper (I)-catalyzed azide-alkyne cycloaddition (“CuAAC”) reaction to form two new covalent bonds between azide nitrogens and alkyne carbons. The covalent bonds form a chemical link (e.g., comprising a five-membered triazole ring) between a first component and a second component that comprised the azide and the alkyne moieties before linkage. This type of cycloaddition reaction is one of the foundational reactions of “click chemistry” because it provides a desirable chemical yield, is physiologically stable, and exhibits a large thermodynamic driving force that favors a “spring-loaded” reaction that yields a single product (e.g., a 1,4-regioisomer of 1,2,3-triazole). See, e.g., Huisgen (1961) “Centenary Lecture-1,3-Dipolar Cycloadditions”, Proceedings of the Chemical Society of London 357; Kolb, Finn, Sharpless (2001) “Click Chemistry: Diverse Chemical Function from a Few Good Reactions”, Angewandte Chemie International Edition 40(11): 2004-2021. For example:

The reaction can be performed in a variety of solvents, including aqueous mixtures, compositions comprising water and/or aqueous mixtures, and a variety of organic solvents including compositions comprising alcohols, dimethyl sulfoxide (DMSO), dimethylformamide (DMF), tert-butyl alcohol (TBA or tBuOH; also known as 2-methyl-2-propanol (2M2P)), and acetone. In some embodiments, the reaction is performed in a milieu comprising a copper-based catalyst such as Cu/Cu(OAc)₂, a tertiary amine such as tris-(benzyltriazolylmethyl)amine (TBTA), and/or tetrahydrofuran and acetonitrile (THF/MeCN).

In some embodiments, the triazole ring linkage has a structure according to:

The triazole ring linkage formed by the alkyne-azide cycloaddition has similar characteristics (e.g., physical, biological, biochemical, chemical characteristics, etc.) as a natural phosphodiester bond present in nucleic acids and therefore is a nucleic acid backbone mimic. Consequently, conventional enzymes that recognize natural nucleic acids as substrates also recognize as substrates the products formed by alkyne-azide cycloaddition as provided by the technology described herein. See, e.g., El-Sagheer et al. (2011) “Biocompatible artificial DNA linker that is read through by DNA polymerases and is functional in Escherichia coli”, Proc Natl Acad Sci USA 108(28): 11338-43.

In some embodiments, the use of nucleotide analogs comprising an alkyne (e.g., a 3′-O-propargyl nucleotide analog) produces nucleic acids (e.g., DNA or RNA polynucleotide fragments) that have a terminal 3′ alkyne group. For example, in some embodiments, nucleotide analogs comprising an alkyne (e.g., a 3′-O-propargyl nucleotide analog) are incorporated into a growing strand of a nucleic acid in a polymerase extension reaction; once incorporated, the nucleotide analogs halt the polymerase reaction. These terminated nucleic acids are an appropriate chemical reactant for a click chemistry reaction (e.g., alkyne-azide cycloaddition), e.g., for a chemical ligation to an azide-modified molecule such as a 5′-azide modified nucleic acid, a labeling moiety comprising an azide, a solid support comprising an azide, a protein comprising an azide, etc., including, but not limited to moieties, entities, and components discussed herein. In some embodiments, for example, the 3′-O-propargyl group at the 3′ terminal of the nucleic acid product is used in a tagging reaction with an azide-modified tag using chemical ligation, e.g., as provided by a click chemistry reaction. The covalent linkage created using this chemistry mimics that of a natural nucleic acid phosphodiester bond, thereby providing for the use of the chemically ligated nucleic acids in subsequent enzymatic reactions, such as a polymerase chain reaction, with the triazole chemical linkage causing minimal, limited, or undetectable (e.g., no) inhibition of the enzymatic reaction.

In some embodiments, the nucleotide analog comprising an alkyne is reacted with a reactant comprising a phosphine moiety in a Staudinger ligation. In a Staudinger ligation, an electrophilic trap (e.g., a methyl ester) is placed on a triarylphosphine aryl group (usually ortho to the phosphorus atom) and reacted with the azide to yield an aza-ylide intermediate, which then rearranges (e.g., in aqueous media) to produce a compound with an amide group and a phosphine oxide function. The Staudinger ligation ligates (attaches and covalently links) the two starting molecules together.

Accordingly, provided herein is technology related to a composition comprising a nucleotide analog having a structure according to:

wherein B is a base and P comprises a phosphate moiety. In some embodiments, P comprises a tetraphosphate; a triphosphate; a diphosphate; a monophosphate; a 5′ hydroxyl; an alpha thiophosphate (e.g., phosphorothioate or phosphorodithioate), a beta thiophosphate (e.g., phosphorothioate or phosphorodithioate), and/or a gamma thiophosphate (e.g., phosphorothioate or phosphorodithioate); or an alpha methylphosphonate, a beta methylphosphonate, and/or a gamma methylphosphonate.

In some embodiments, P comprises an azide (e.g., N₃, e.g., N═N═N), thus providing, in some embodiments, a directional, bi-functional polymerization agent as described herein.

In some embodiments, B is a cytosine, guanine, adenine, thymine, or uracil base. That is, in some embodiments, B is a purine or a pyrimidine or a modified purine or a modified pyrimidine. The technology is not limited in the bases B that find use in the nucleotide analogs. For example, B can be any synthetic, artificial, or natural base; thus, in some embodiments B is a synthetic base; in some embodiments, B is an artificial base; in some embodiments, B is a natural base. In some embodiments, compositions comprise a nucleotide analog and a nucleic acid (e.g., a polynucleotide). Compositions in some embodiments further comprise a polymerase and/or a nucleotide (e.g., a conventional nucleotide). In compositions comprising a nucleotide and a nucleotide analog, in some embodiments the number ratio of the nucleotide analog to the nucleotide is 1:1, 1:2, 1:3, 1:4, 1:5, 1:10, 1:15, 1:20, 1:25, 1:30, 1:50, 1:75, 1:100, 1:200, 1:300, 1:400, 1:500, 1:600, 1:700, 1:800, 1:900, 1:1000, 1:5000, or 1:10000.
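To give a feel for what the listed number ratios imply, the following is a minimal sketch and not part of the described technology. It assumes, purely for illustration, that a polymerase shows no preference between the nucleotide analog and the conventional nucleotide carrying the same base B, so that the chance of termination at a position calling for B follows directly from the ratio. The function names are hypothetical.

```python
from fractions import Fraction


def termination_probability(analog_parts: int, nucleotide_parts: int) -> Fraction:
    """Chance that one incorporation event uses the terminating analog.

    Assumption (illustrative only): the polymerase incorporates the analog
    and the conventional nucleotide with equal efficiency.
    """
    return Fraction(analog_parts, analog_parts + nucleotide_parts)


def expected_incorporations(analog_parts: int, nucleotide_parts: int) -> Fraction:
    """Mean number of incorporations of base B up to and including the terminator.

    Under the equal-efficiency assumption this is the mean of a geometric
    distribution, i.e. 1 / p.
    """
    return 1 / termination_probability(analog_parts, nucleotide_parts)


if __name__ == "__main__":
    # A few of the analog:nucleotide ratios listed in the text.
    for analog, nucleotide in [(1, 1), (1, 10), (1, 100), (1, 1000)]:
        p = termination_probability(analog, nucleotide)
        mean = expected_incorporations(analog, nucleotide)
        print(f"{analog}:{nucleotide}  p(terminate) = {p}  mean incorporations = {mean}")
```

Under this simplified model, lower analog-to-nucleotide ratios shift the population of terminated products toward longer fragments. Real polymerases may discriminate against 3′-modified substrates, in which case the effective ratio would differ from the number ratio.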

In some embodiments, a nucleic acid comprises a nucleotide analog as provided herein. In some embodiments, the nucleic acid comprises the nucleotide analog at its 3′ end (e.g., the nucleotide analog is at the 3′ end of the nucleic acid). The technology, in some embodiments, relates to the synthesis of a nucleic acid comprising a nucleotide analog by a biological enzyme. That is, the biological enzyme recognizes the nucleotide analog as a substrate and incorporates the nucleotide analog into the nucleic acid. For example, in some embodiments, the nucleic acid is produced by a polymerase.

In some embodiments, the compositions further comprise an azide, e.g., a component, entity, molecule, surface, biomolecule, etc., comprising an azide.

In some embodiments, the compositions comprise multiple nucleic acids; accordingly, in some embodiments, compositions comprise a second nucleic acid (e.g., in addition to a nucleic acid comprising a nucleotide analog). The technology encompasses functionalized nucleic acids for reacting with a nucleic acid comprising a nucleotide analog. Thus, in some embodiments, the second nucleic acid comprises an azide moiety, e.g., in some embodiments, the second nucleic acid comprises an azide moiety at the 5′ end of the second nucleic acid.

The technology is not limited in the entity (e.g., comprising an azide group) reacted with the nucleic acid comprising the nucleotide analog. For instance, in some embodiments, compositions further comprise a label comprising an azide, a tag comprising an azide, a solid support comprising an azide, a nucleotide comprising an azide, a biotin comprising an azide, or a protein comprising an azide. In some embodiments, an alkyne moiety and an azide moiety are reacted using a “click chemistry” reaction catalyzed by a copper-based catalyst. As such, in some embodiments compositions further comprise a copper-based catalyst reagent. The reaction of the azide and alkyne produces, in some embodiments, a triazole moiety. In some embodiments, a nucleic acid comprising an alkyne (e.g., a nucleic acid comprising a nucleotide analog comprising an alkyne) is reacted with a nucleic acid comprising an azide to produce a longer nucleic acid. As such, in some embodiments compositions according to the technology further comprise a nucleic acid comprising a triazole (e.g., that forms a link between the two nucleic acids). In some embodiments, the reaction of the alkyne and azide proceeds with regioselectivity, e.g., in some embodiments the nucleic acid comprises a 1′,4′ substituted triazole. In some embodiments, the nucleic acid comprising the nucleotide analog is reacted with an adaptor oligonucleotide, an adaptor oligonucleotide comprising a barcode, or a barcode oligonucleotide comprising an azide. Thus, in some embodiments are provided reaction mixtures comprising an adaptor oligonucleotide, an adaptor oligonucleotide comprising a barcode, or a barcode oligonucleotide.

In some embodiments, a nucleic acid (e.g., formed from uniting two nucleic acids by “click chemistry” reaction of an alkyne and an azide) comprises a structure according to:

Another aspect of the technology relates to embodiments of methods for synthesizing a modified nucleic acid, the method comprising providing a nucleotide analog comprising an alkyne group and linking a nucleic acid to the nucleotide analog to produce a modified nucleic acid comprising the nucleotide analog. In some embodiments, the nucleotide analog has a structure according to:

wherein B is a base (e.g., cytosine, guanine, adenine, thymine, or uracil) and P comprises a triphosphate moiety. Embodiments of the method comprise further providing, e.g., a template, a primer, a nucleotide (e.g., a conventional nucleotide), and/or a polymerase. The nucleotide analogs are recognized as a substrate by biological enzymes such as polymerases; thus, in some embodiments, a polymerase catalyzes linking a nucleic acid to the nucleotide analog to produce a modified nucleic acid comprising the nucleotide analog. The modified nucleic acid provides a substrate for reaction with an azide-carrying entity, e.g., to form a conjugated product by a “click chemistry” reaction. Thus, in some embodiments the methods further comprise reacting the modified nucleic acid with an azide moiety. The methods are not limited in the entity that comprises the azide moiety; for example, in some embodiments the methods comprise reacting the modified nucleic acid with a second nucleic acid comprising an azide moiety, e.g., reacting the modified nucleic acid with a second nucleic acid comprising an azide moiety at the 5′ end of the second nucleic acid, a label comprising an azide, a tag comprising an azide, a solid support comprising an azide, a nucleotide comprising an azide, and/or a protein comprising an azide.

The methods find use in linking an adaptor oligonucleotide (e.g., for use in next-generation sequencing) to a nucleic acid comprising a nucleotide analog. Accordingly, in some embodiments, the methods further comprise reacting the modified nucleic acid with an adaptor oligonucleotide comprising an azide moiety, an adaptor oligonucleotide comprising a barcode and comprising an azide moiety, and/or a barcode oligonucleotide comprising an azide moiety, e.g., to produce a nucleic acid-oligonucleotide conjugate. In some embodiments, reactions of a nucleotide analog (e.g., a nucleic acid comprising a nucleotide analog) and an azide are catalyzed by a copper-based catalyst reagent. Associated methods, accordingly, in some embodiments comprise reacting the modified nucleic acid with an azide moiety and a copper-based catalyst reagent. As the triazole ring formed by the “click chemistry” reaction does not substantially and/or detectably inhibit biological enzyme activity, the nucleic acid-oligonucleotide conjugate provides a useful nucleic acid for further manipulation, e.g., in some embodiments the modified nucleic acid is a substrate for a biological enzyme, the modified nucleic acid is a substrate for a polymerase, and/or the modified nucleic acid is a substrate for a sequencing reaction.

The nucleotide analogs provided herein are functional terminators, e.g., they act to terminate synthesis of a nucleic acid (e.g., similar to a dideoxynucleotide as used in Sanger sequencing) while also comprising a reactive group for further chemical processing. Accordingly, as described herein, in some embodiments, the methods further comprise terminating polymerization with the nucleotide analog.

Related methods provide, in some embodiments, a method for sequencing a nucleic acid, the method comprising hybridizing a primer to a nucleic acid template to form a hybridized primer/nucleic acid template complex; providing a plurality of nucleotide analogs, each nucleotide analog comprising an alkyne moiety; reacting the hybridized primer/nucleic acid template complex and the nucleotide analog with a polymerase to add the nucleotide analog to the primer by a polymerase reaction to form an extended product comprising an incorporated nucleotide analog; and reacting the extended product with an azide-containing compound to form a structure comprising a triazole ring. In particular embodiments, the nucleotide analogs are 3′-O-propargyl-dNTP nucleotide analogs and N is selected from the group consisting of A, C, G, T and U. As the triazole ring formed by the “click chemistry” reaction does not substantially and/or detectably inhibit biological enzyme activity, the nucleic acid-oligonucleotide conjugate provides a useful nucleic acid for further manipulation. Thus, in some embodiments the structure comprising a triazole ring is used in subsequent enzymatic reactions, e.g., a polymerase chain reaction and/or a sequencing reaction. Polymerization in the presence of nucleotide analogs is performed, in some embodiments, in the presence also of conventional (e.g., non-terminator) nucleotides. Related methods comprise providing conventional nucleotides.
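The steps of the sequencing method above (primer hybridization, extension in the presence of both terminating analogs and conventional nucleotides, and reaction with an azide-containing compound) can be sketched in code. The following is a simplified illustration and not the described method itself: it enumerates every product that could arise if termination occurred at each position after the primer, and it represents the triazole link as a text marker. The function names, the example sequences, and the `ADAPTOR` placeholder are hypothetical.

```python
# Watson-Crick partners used to build the new strand from the template.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def extension_ladder(primer, template_3to5):
    """List every possible terminated extension product, shortest first.

    `template_3to5` is the template written 3'->5', so the new strand
    (written 5'->3') is its position-by-position complement. Each returned
    product is assumed to end in a 3'-alkyne terminator.
    """
    full_product = "".join(COMPLEMENT[base] for base in template_3to5)
    if not full_product.startswith(primer):
        raise ValueError("primer does not anneal to the 3' end of the template")
    return [full_product[:end]
            for end in range(len(primer) + 1, len(full_product) + 1)]


def click_ligate(terminated_product, azide_adaptor):
    """Join a 3'-alkyne product to a 5'-azide adaptor (link shown as a marker)."""
    return f"{terminated_product}-[triazole]-{azide_adaptor}"


if __name__ == "__main__":
    for product in extension_ladder("ATG", "TACGGTCA"):
        print(click_ligate(product, "ADAPTOR"))
```

Each printed line corresponds to one member of a nested set of fragments that differ in length by one nucleotide, each carrying the same adaptor at its 3′ end.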

Also provided herein are embodiments of kits. For example, in some embodiments, kits are provided for synthesizing a modified nucleic acid, the kit comprising a nucleotide analog comprising an alkynyl group; and a copper-based catalyst reagent. In some embodiments kits further comprise other components that find use in the processing and/or manipulation of nucleic acids. Thus, in some embodiments kits further comprise a polymerase, an adaptor oligonucleotide comprising an azide moiety, and/or a nucleotide (e.g., a conventional nucleotide).

In some embodiments, provided herein are compositions comprising a nucleotide analog having a structure according to:

wherein B is a base (e.g., a purine or a pyrimidine such as a cytosine, guanine, adenine, thymine, or uracil; e.g., a modified purine or a modified pyrimidine) and P comprises a phosphate moiety (e.g., a tetraphosphate; a triphosphate; a diphosphate; a monophosphate; a 5′ hydroxyl; an alpha thiophosphate (e.g., phosphorothioate or phosphorodithioate), a beta thiophosphate (e.g., phosphorothioate or phosphorodithioate), and/or a gamma thiophosphate (e.g., phosphorothioate or phosphorodithioate); or an alpha methylphosphonate, a beta methylphosphonate, and/or a gamma methylphosphonate); a nucleic acid; a polymerase; and a nucleotide (e.g., comprising the base B, e.g., in a number ratio of the nucleotide analog to the nucleotide that is 1:1, 1:2, 1:3, 1:4, 1:5, 1:10, 1:15, 1:20, 1:25, 1:30, 1:50, 1:75, 1:100, 1:200, 1:300, 1:400, 1:500, 1:600, 1:700, 1:800, 1:900, 1:1000, 1:5000, or 1:10000).

Also provided are embodiments of compositions comprising a nucleic acid (e.g., produced by a polymerase), wherein the nucleic acid comprises a nucleotide analog (e.g., at its 3′ end) having a structure according to:

wherein B is a base (e.g., a purine or a pyrimidine such as a cytosine, guanine, adenine, thymine, or uracil; e.g., a modified purine or a modified pyrimidine) and P comprises a phosphate moiety (e.g., a tetraphosphate; a triphosphate; a diphosphate; a monophosphate; a 5′ hydroxyl; an alpha thiophosphate (e.g., phosphorothioate or phosphorodithioate), a beta thiophosphate (e.g., phosphorothioate or phosphorodithioate), and/or a gamma thiophosphate (e.g., phosphorothioate or phosphorodithioate); or an alpha methylphosphonate, a beta methylphosphonate, and/or a gamma methylphosphonate); a second nucleic acid (e.g., comprising an azide, e.g., at its 5′ end), a label comprising an azide, a tag comprising an azide, a solid support comprising an azide, a nucleotide comprising an azide, a biotin comprising an azide, or a protein comprising an azide; a copper (e.g., copper-based) catalyst reagent; a nucleic acid comprising a triazole (e.g., a 1′,4′ substituted triazole); and/or a structure such as:

an adaptor oligonucleotide, an adaptor oligonucleotide comprising a barcode, or a barcode oligonucleotide.

In another aspect, the technology provides a method for synthesizing a modified nucleic acid, the method comprising providing a nucleotide analog comprising an alkyne group, e.g., a nucleotide having a structure according to:

wherein B is a base (e.g., cytosine, guanine, adenine, thymine, or uracil) and P comprises a triphosphate moiety; linking a nucleic acid to the nucleotide analog to produce a modified nucleic acid comprising the nucleotide analog; providing a template; providing a primer; providing a nucleotide; providing a polymerase (e.g., to catalyze the linking of the nucleic acid to the nucleotide analog); terminating polymerization with the nucleotide analog; reacting the modified nucleic acid with an azide moiety (e.g., with a second nucleic acid comprising an azide moiety at its 5′ end, a label comprising an azide, a tag comprising an azide, a solid support comprising an azide, a nucleotide comprising an azide, a protein comprising an azide, an adaptor oligonucleotide comprising an azide moiety, an adaptor oligonucleotide comprising a barcode and comprising an azide moiety, or a barcode oligonucleotide comprising an azide moiety), e.g., to produce a nucleic acid-oligonucleotide conjugate (e.g., that is a substrate for a biological enzyme such as a polymerase and/or to provide a substrate for a sequencing reaction); and/or reacting the modified nucleic acid with an azide moiety and a copper-based catalyst reagent.

In some embodiments is provided a method for sequencing a nucleic acid, the method comprising hybridizing a primer to a nucleic acid template to form a hybridized primer/nucleic acid template complex; providing a plurality of nucleotide analogs (e.g., 3′-O-propargyl-dNTP nucleotide analogs wherein N is selected from the group consisting of A, C, G, T, and U), each nucleotide analog comprising an alkyne moiety; providing conventional nucleotides; reacting the hybridized primer/nucleic acid template complex and the nucleotide analog with a polymerase to add the nucleotide analog to the primer by a polymerase reaction to form an extended product comprising an incorporated nucleotide analog; and reacting the extended product with an azide-containing compound to form a structure comprising a triazole ring (e.g., that is used in subsequent enzymatic reactions such as a polymerase chain reaction).

In some embodiments is provided a kit for synthesizing a modified nucleic acid, the kit comprising a nucleotide analog comprising an alkynyl group; a copper-based catalyst reagent; a polymerase; an adaptor oligonucleotide comprising an azide moiety; and a conventional nucleotide.

Additional embodiments are provided below and as variations of the technology described as understood by a person having ordinary skill in the art.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features, aspects, and advantages of the present technology will become better understood with regard to the following drawings:

FIG. 1 is a schematic showing a polymerase extension reaction using 3′-O-propargyl-dGTP. The polymerase extension halts after the incorporation of 3′-O-propargyl-dGTP, producing product 1. A 5′-azide-modified DNA fragment is chemically ligated to product 1 using click chemistry producing product 2. The covalent linkage created by the formation of the triazole ring mimics that of the natural DNA backbone phosphodiester linkage. Product 2 is used subsequently in enzymatic reactions (e.g., PCR).

FIG. 2 is a schematic showing a polymerase extension reaction using a combination of dNTPs and 3′-O-propargyl-dNTPs. DNA ladder fragments (n+1 fragments) are generated with each of the fragments' 3′-ends having an alkyne group. These DNA ladder fragments are ligated to a 5′-azide-modified DNA molecule, which has a “universal” sequence and/or a barcode sequence and/or a primer binding site, via click chemistry. The ligated DNA fragments are subsequently treated and used as input in next generation sequencing (NGS) processes. These DNA fragments with the n+1 characteristic produce DNA sequencing data by assembling short reads, thereby significantly decreasing the NGS run time.

FIG. 3 is a drawing showing a schematic of a primer extension and adaptor ligation according to some embodiments of the technology.

FIG. 4 is a drawing showing a schematic of a sequencing-related embodiment according to some embodiments of the technology.

It is to be understood that the figures are not necessarily drawn to scale, nor are the objects in the figures necessarily drawn to scale in relationship to one another. The figures are depictions that are intended to bring clarity and understanding to various embodiments of apparatuses, systems, and methods disclosed herein. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. Moreover, it should be appreciated that the drawings are not intended to limit the scope of the present teachings in any way.

DETAILED DESCRIPTION

Provided herein is technology relating to the manipulation and detection of nucleic acids, including but not limited to compositions, methods, systems, and kits related to nucleotides comprising a chemically reactive linking moiety. In particular embodiments, the technology provides nucleotide analogs comprising a base (e.g., adenine, guanine, cytosine, thymine, or uracil), a sugar (e.g., a ribose or deoxyribose), and an alkyne chemical moiety, e.g., attached to the 3′ oxygen of the sugar (e.g., the 3′ oxygen of the deoxyribose or the 3′ oxygen of the ribose). The nucleotide analogs (e.g., a 3′-alkynyl nucleotide analog, e.g., a 3′-O-propargyl nucleotide analog such as a 3′-O-propargyl dNTP or a 3′-O-propargyl NTP) find use in embodiments of the technology to introduce a particular chemical moiety (e.g., an alkyne) at the end (e.g., the 3′ end) of a nucleic acid (e.g., a DNA or RNA) by a polymerase extension reaction, and, consequently, to produce a nucleic acid modification that does not exist in natural biological systems. Chemical ligation between the polymerase extension products and appropriate conjugation partners (e.g., azide modified entities) is achieved with high efficiency and specificity using click chemistry. Embodiments of the functional nucleotide terminators provided herein are used to produce nucleic acids that are useful for various molecular biology, biochemical, and biotechnology applications.

The section headings used herein are for organizational purposes only and are not to be construed as limiting the described subject matter in any way. In this detailed description of the various embodiments, for purposes of explanation, numerous specific details are set forth to provide a thorough understanding of the embodiments disclosed. One skilled in the art will appreciate, however, that these various embodiments may be practiced with or without these specific details. In other instances, structures and devices are shown in block diagram form. Furthermore, one skilled in the art can readily appreciate that the specific sequences in which methods are presented and performed are illustrative and it is contemplated that the sequences can be varied and still remain within the spirit and scope of the various embodiments disclosed herein.

All literature and similar materials cited in this application, including but not limited to, patents, patent applications, articles, books, treatises, and internet web pages are expressly incorporated by reference in their entirety for any purpose. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art to which the various embodiments described herein belong. When definitions of terms in incorporated references appear to differ from the definitions provided in the present teachings, the definition provided in the present teachings shall control.

DEFINITIONS

To facilitate an understanding of the present technology, a number of terms and phrases are defined below. Additional definitions are set forth throughout the detailed description.

Throughout the specification and claims, the following terms take the meanings explicitly associated herein, unless the context clearly dictates otherwise. The phrase “in one embodiment” as used herein does not necessarily refer to the same embodiment, though it may. Furthermore, the phrase “in another embodiment” as used herein does not necessarily refer to a different embodiment, although it may. Thus, as described below, various embodiments of the invention may be readily combined, without departing from the scope or spirit of the invention.

In addition, as used herein, the term “or” is an inclusive “or” operator and is equivalent to the term “and/or” unless the context clearly dictates otherwise. The term “based on” is not exclusive and allows for being based on additional factors not described, unless the context clearly dictates otherwise. In addition, throughout the specification, the meaning of “a”, “an”, and “the” include plural references. The meaning of “in” includes “in” and “on.”

As used herein, a “nucleotide” comprises a “base” (alternatively, a “nucleobase” or “nitrogenous base”), a “sugar” (in particular, a five-carbon sugar, e.g., ribose or 2-deoxyribose), and a “phosphate moiety” of one or more phosphate groups (e.g., a monophosphate, a diphosphate, a triphosphate, a tetraphosphate, etc. consisting of one, two, three, four or more linked phosphates, respectively). Without the phosphate moiety, the nucleobase and the sugar compose a “nucleoside”. A nucleotide can thus also be called a nucleoside monophosphate or a nucleoside diphosphate or a nucleoside triphosphate, depending on the number of phosphate groups attached. The phosphate moiety is usually attached to the 5′ carbon of the sugar, though some nucleotides comprise phosphate moieties attached to the 2′ carbon or the 3′ carbon of the sugar. Nucleotides contain either a purine base (e.g., adenine or guanine) or a pyrimidine base (e.g., cytosine, thymine, or uracil). Some nucleotides contain non-natural bases. Ribonucleotides are nucleotides in which the sugar is ribose. Deoxyribonucleotides are nucleotides in which the sugar is deoxyribose.
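The relationship among base, sugar, and phosphate moiety in this definition can be illustrated with a small data structure. The following is a minimal sketch using standard nucleoside names; the class and function names are hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass

# Standard nucleoside names for the five common bases.
NUCLEOSIDE_NAMES = {
    "adenine": "adenosine",
    "guanine": "guanosine",
    "cytosine": "cytidine",
    "thymine": "thymidine",
    "uracil": "uridine",
}

# Prefixes for one to four linked phosphates.
PHOSPHATE_PREFIXES = {1: "mono", 2: "di", 3: "tri", 4: "tetra"}


@dataclass(frozen=True)
class Nucleotide:
    base: str        # e.g. "adenine"
    sugar: str       # "ribose" or "deoxyribose"
    phosphates: int  # number of linked phosphate groups; 0 means a nucleoside


def systematic_name(n: Nucleotide) -> str:
    """Name the compound from its base, sugar, and phosphate count."""
    name = NUCLEOSIDE_NAMES[n.base]
    if n.sugar == "deoxyribose":
        name = "deoxy" + name
    if n.phosphates == 0:
        return name  # no phosphate moiety: a nucleoside
    return f"{name} {PHOSPHATE_PREFIXES[n.phosphates]}phosphate"


if __name__ == "__main__":
    print(systematic_name(Nucleotide("guanine", "deoxyribose", 3)))
    print(systematic_name(Nucleotide("uracil", "ribose", 0)))
```

The sketch covers only the common bases and one to four phosphates; non-natural bases and modified phosphate moieties would need additional entries.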

As used herein, a “nucleic acid” shall mean any nucleic acid molecule, including, without limitation, DNA, RNA, and hybrids thereof. The nucleic acid bases that form nucleic acid molecules can be the bases A, C, G, T and U, as well as derivatives thereof. Derivatives of these bases are well known in the art. The term should be understood to include, as equivalents, analogs of either DNA or RNA made from nucleotide analogs. The term as used herein also encompasses cDNA that is complementary DNA produced from an RNA template, for example by the action of a reverse transcriptase. It is well known that DNA (deoxyribonucleic acid) is a chain of nucleotides consisting of 4 types of nucleotides—A (adenine), T (thymine), C (cytosine), and G (guanine)—and that RNA (ribonucleic acid) is a chain of nucleotides consisting of 4 types of nucleotides—A, U (uracil), G, and C. It is also known that all of these 5 types of nucleotides specifically bind to one another in combinations called complementary base pairing. That is, adenine (A) pairs with thymine (T) (in the case of RNA, however, adenine (A) pairs with uracil (U)) and cytosine (C) pairs with guanine (G), so that each of these base pairs forms a double strand. As used herein, “nucleic acid sequencing data”, “nucleic acid sequencing information”, “nucleic acid sequence”, “genomic sequence”, “genetic sequence”, “fragment sequence”, or “nucleic acid sequencing read” denotes any information or data that is indicative of the order of the nucleotide bases (e.g., adenine, guanine, cytosine, and thymine/uracil) in a molecule (e.g., a whole genome, a whole transcriptome, an exome, oligonucleotide, polynucleotide, fragment, etc.) 
of DNA or RNA. It should be understood that the present teachings contemplate sequence information obtained using all available varieties of techniques, platforms or technologies, including, but not limited to: capillary electrophoresis, microarrays, ligation-based systems, polymerase-based systems, hybridization-based systems, direct or indirect nucleotide identification systems, pyrosequencing, ion- or pH-based detection systems, electronic signature-based systems, pore-based (e.g., nanopore), visualization-based systems, etc.

Reference to a base, a nucleotide, or to another molecule may be in the singular or plural. That is, “a base” may refer to a single molecule of that base or to a plurality of the base, e.g., in a solution.

A “polynucleotide”, “nucleic acid”, or “oligonucleotide” refers to a linear polymer of nucleosides (including deoxyribonucleosides, ribonucleosides, or analogs thereof) joined by internucleosidic linkages. Typically, a polynucleotide comprises at least three nucleosides. Usually oligonucleotides range in size from a few monomeric units, e.g. 3 to 4, to several hundreds of monomeric units. Whenever a polynucleotide such as an oligonucleotide is represented by a sequence of letters, such as “ATGCCTG”, it will be understood that the nucleotides are in 5′ to 3′ order from left to right and that “A” denotes deoxyadenosine, “C” denotes deoxycytidine, “G” denotes deoxyguanosine, and “T” denotes thymidine, unless otherwise noted. The letters A, C, G, and T may be used to refer to the bases themselves, to nucleosides, or to nucleotides comprising the bases, as is standard in the art.
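As a non-limiting illustration of the 5′ to 3′ letter convention and the complementary base pairing described above (A with T or U, C with G), the following Python sketch computes the strand complementary to a sequence of letters. The function name `reverse_complement` and its `rna` parameter are hypothetical and are introduced only for this example.

```python
# Illustrative sketch only: canonical Watson-Crick pairing for DNA and RNA.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement(sequence: str, rna: bool = False) -> str:
    """Return the complementary strand, written 5' to 3'.

    `sequence` is read 5' to 3' from left to right. Letters outside the
    canonical alphabet raise KeyError.
    """
    table = RNA_COMPLEMENT if rna else DNA_COMPLEMENT
    # Walk the input from its 3' end so that the result also reads 5' to 3'.
    return "".join(table[base] for base in reversed(sequence.upper()))


print(reverse_complement("ATGCCTG"))         # CAGGCAT
print(reverse_complement("AUGC", rna=True))  # GCAU
```

The sketch handles only the bases A, C, G, T, and U; base analogs and universal bases would require an extended pairing table.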

As used herein, the phrase “dNTP” means deoxynucleotide triphosphate, where the nucleotide comprises a nucleotide base, such as A, T, C, G or U.

The term “monomer” as used herein means any compound that can be incorporated into a growing molecular chain by a given polymerase. Such monomers include, without limitation, naturally occurring nucleotides (e.g., ATP, GTP, TTP, UTP, CTP, dATP, dGTP, dTTP, dUTP, dCTP, synthetic analogs), precursors for each nucleotide, non-naturally occurring nucleotides and their precursors or any other molecule that can be incorporated into a growing polymer chain by a given polymerase.

As used herein, “complementary” generally refers to specific nucleotide duplexing to form canonical Watson-Crick base pairs, as is understood by those skilled in the art. However, complementary also includes base-pairing of nucleotide analogs that are capable of universal base-pairing with A, T, G or C nucleotides and locked nucleic acids that enhance the thermal stability of duplexes. One skilled in the art will recognize that hybridization stringency is a determinant in the degree of match or mismatch in the duplex formed by hybridization.

As used herein, “moiety” refers to one of two or more parts into which something may be divided, such as, for example, the various parts of a tether, a molecule, or a probe.

As used herein, a “linker” is a molecule or moiety that joins two molecules or moieties and/or provides spacing between the two molecules or moieties such that they are able to function in their intended manner. For example, a linker can comprise a diamine hydrocarbon chain that is covalently bound through a reactive group on one end to an oligonucleotide analog molecule and through a reactive group on another end to a solid support, such as, for example, a bead surface. Coupling of linkers to nucleotides and substrate constructs of interest can be accomplished through the use of coupling reagents that are known in the art (see, e.g., Efimov et al., Nucleic Acids Res. 27: 4416-4426, 1999). Methods of derivatizing and coupling organic molecules are well known in the arts of organic and bioorganic chemistry. A linker may also be cleavable (e.g., photocleavable) or reversible.

A “polymerase” is an enzyme generally for joining 3′-OH, 5′-triphosphate nucleotides, oligomers, and their analogs. Polymerases include, but are not limited to, DNA-dependent DNA polymerases, DNA-dependent RNA polymerases, RNA-dependent DNA polymerases, RNA-dependent RNA polymerases, T7 DNA polymerase, T3 DNA polymerase, T4 DNA polymerase, T7 RNA polymerase, T3 RNA polymerase, SP6 RNA polymerase, DNA polymerase I, Klenow fragment, Thermus aquaticus DNA polymerase, Tth DNA polymerase, Vent DNA polymerase (New England Biolabs), Deep Vent DNA polymerase (New England Biolabs), Bst DNA Polymerase Large Fragment, Stoffel Fragment, 9°N DNA Polymerase, Pfu DNA Polymerase, Tfl DNA Polymerase, RepliPHI Phi29 Polymerase, Tli DNA polymerase, eukaryotic DNA polymerase beta, telomerase, Therminator polymerase (New England Biolabs), KOD HiFi DNA polymerase (Novagen), KOD1 DNA polymerase, Q-beta replicase, terminal transferase, AMV reverse transcriptase, M-MLV reverse transcriptase, Phi6 reverse transcriptase, HIV-1 reverse transcriptase, novel polymerases discovered by bioprospecting, and polymerases cited in U.S. Pat. Appl. Pub. No. 2007/0048748 and in U.S. Pat. Nos. 6,329,178; 6,602,695; and 6,395,524. These polymerases include wild-type, mutant isoforms, and genetically engineered variants such as exo-polymerases and other mutants, e.g., that tolerate modified (e.g., labeled) nucleotides and incorporate them into a strand of nucleic acid.

The term “primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced (e.g., in the presence of nucleotides and an inducing agent such as a polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact length of a primer depends on many factors including temperature, source of primer, and the use of the method.

As used herein, a “system” denotes a set of components, real or abstract, comprising a whole where each component interacts with or is related to at least one other component within the whole.

Various nucleic acid sequencing platforms, nucleic acid assembly and/or mapping systems (e.g., computer software and/or hardware) are described, e.g., in U.S. Pat. Appl. Pub. No. 2011/0270533, which is incorporated herein by reference.

As used herein, the terms “alkyl” and the prefix “alk-” are inclusive of both straight chain and branched chain saturated or unsaturated groups, and of cyclic groups, e.g., cycloalkyl and cycloalkenyl groups. Unless otherwise specified, acyclic alkyl groups are from 1 to 6 carbons. Cyclic groups can be monocyclic or polycyclic and preferably have from 3 to 8 ring carbon atoms. Exemplary cyclic groups include cyclopropyl, cyclopentyl, cyclohexyl, and adamantyl groups. Alkyl groups may be substituted with one or more substituents or unsubstituted. Exemplary substituents include alkoxy, aryloxy, sulfhydryl, alkylthio, arylthio, halogen, alkylsilyl, hydroxyl, fluoroalkyl, perfluoroalkyl, amino, aminoalkyl, disubstituted amino, quaternary amino, hydroxyalkyl, carboxyalkyl, and carboxyl groups. When the prefix “alk” is used, the number of carbons contained in the alkyl chain is given by the range that directly precedes this term, with the number of carbons contained in the remainder of the group that includes this prefix defined elsewhere herein. For example, the term “C₁-C₄ alkaryl” exemplifies an aryl group of from 6 to 18 carbons (e.g., see below) attached to an alkyl group of from 1 to 4 carbons.

As used herein, the term “alkoxy” refers to a chemical substituent of the formula —OR, where R is an alkyl group. By “aryloxy” is meant a chemical substituent of the formula —OR′, where R′ is an aryl group.

As used herein, the term “alkyne” refers to a hydrocarbon comprising a carbon-carbon triple bond. One example of an alkyne-containing functional group is the propargyl group. Propargyl is an alkyl functional group of 2-propynyl with the structure HC≡C—CH₂—, derived from the alkyne propyne.

As used herein, the term “azide” or “azido” refers to any compound having the N₃ moiety therein. The azide may be an organic azide or a metal azide. One reaction involving azides is a type of click chemistry known as a copper(I)-catalyzed 1,3-dipolar cycloaddition reaction. This reaction conjugates alkynes and azides to form a five-membered triazole ring that provides a covalent linkage.

As used herein, the term “backbone” refers to a structural component of a nucleic acid molecule that is a series of covalently bonded atoms that together create the continuous chain of the molecule. In “natural” nucleic acids the backbone comprises phosphodiester bonds linking alternating sugars (e.g., ribose or deoxyribose) and phosphate moieties (related to phosphoric acid).

As used herein a “target site” is a site of a subject at which it is desired for a bioactive agent to be delivered and to be active. A target site may be a cell, a cell type, a tissue, an organ, an area, or other designation of a subject's anatomy and/or physiology.

The terms “protein” and “polypeptide” refer to compounds comprising amino acids joined via peptide bonds and are used interchangeably. Conventional one- and three-letter amino acid codes are used herein as follows—Alanine: Ala, A; Arginine: Arg, R; Asparagine: Asn, N; Aspartate: Asp, D; Cysteine: Cys, C; Glutamate: Glu, E; Glutamine: Gln, Q; Glycine: Gly, G; Histidine: His, H; Isoleucine: Ile, I; Leucine: Leu, L; Lysine: Lys, K; Methionine: Met, M; Phenylalanine: Phe, F; Proline: Pro, P; Serine: Ser, S; Threonine: Thr, T; Tryptophan: Trp, W; Tyrosine: Tyr, Y; Valine: Val, V. As used herein, the codes Xaa and X refer to any amino acid.

In some embodiments compounds of the technology comprise an antibody component or moiety, e.g., an antibody or fragments or derivatives thereof. As used herein, an “antibody”, also known as an “immunoglobulin” (e.g., IgG, IgM, IgA, IgD, IgE), comprises two heavy chains linked to each other by disulfide bonds and two light chains, each of which is linked to a heavy chain by a disulfide bond. The specificity of an antibody resides in the structural complementarity between the antigen combining site of the antibody (or paratope) and the antigen determinant (or epitope). Antigen combining sites are made up of residues that are primarily from the hypervariable or complementarity determining regions (CDRs). Occasionally, residues from nonhypervariable or framework regions influence the overall domain structure and hence the combining site. In some embodiments the targeting moiety is a fragment of antibody, e.g., any protein or polypeptide-containing molecule that comprises at least a portion of an immunoglobulin molecule such as to permit specific interaction between said molecule and an antigen. The portion of an immunoglobulin molecule may include, but is not limited to, at least one complementarity determining region (CDR) of a heavy or light chain or a ligand binding portion thereof, a heavy chain or light chain variable region, a heavy chain or light chain constant region, a framework region, or any portion thereof. Such fragments may be produced by enzymatic cleavage, synthetic or recombinant techniques, as known in the art and/or as described herein. Antibodies can also be produced in a variety of truncated forms using antibody genes in which one or more stop codons have been introduced upstream of the natural stop site. The various portions of antibodies can be joined together chemically by conventional techniques, or can be prepared as a contiguous protein using genetic engineering techniques.

Fragments of antibodies include, but are not limited to, Fab (e.g., by papain digestion), F(ab′)2 (e.g., by pepsin digestion), Fab′ (e.g., by pepsin digestion and partial reduction) and Fv or scFv (e.g., by molecular biology techniques) fragments.

A Fab fragment can be obtained by treating an antibody with the protease papain. Also, the Fab may be produced by inserting DNA encoding a Fab of the antibody into a vector for a prokaryotic expression system or for a eukaryotic expression system, and introducing the vector into a prokaryote or eukaryote to express the Fab. A F(ab′)2 may be obtained by treating an antibody with the protease pepsin. Also, the F(ab′)2 can be produced by binding a Fab′ via a thioether bond or a disulfide bond. A Fab′ may be obtained by treating F(ab′)2 with a reducing agent, e.g., dithiothreitol. Also, a Fab′ can be produced by inserting DNA encoding a Fab′ fragment of the antibody into an expression vector for a prokaryote or an expression vector for a eukaryote, and introducing the vector into a prokaryote or eukaryote for its expression. A Fv fragment may be produced by restricted cleavage by pepsin, e.g., at 4° C. and pH 4.0 (a method called “cold pepsin digestion”). The Fv fragment consists of the heavy chain variable domain (VH) and the light chain variable domain (VL) held together by strong noncovalent interaction. A scFv fragment may be produced by obtaining cDNA encoding the VH and VL domains as previously described, constructing DNA encoding scFv, inserting the DNA into an expression vector for a prokaryote or an expression vector for a eukaryote, and then introducing the expression vector into a prokaryote or eukaryote to express the scFv.

In general, antibodies can usually be raised to any antigen, using the many conventional techniques now well known in the art. Any targeting antibody to an antigen which is found in sufficient concentration at a site in the body of a mammal which is of diagnostic or therapeutic interest can be used to make the compounds provided herein.

As used herein, the term “conjugated” refers to when one molecule or agent is physically or chemically coupled or adhered to another molecule or agent. Examples of conjugation include covalent linkage and electrostatic complexation. The terms “complexed,” “complexed with,” and “conjugated” are used interchangeably herein.

As used herein, the term “treatment” is defined as the application or administration of a therapeutic agent described herein, or identified by a method described herein, to a patient, or application or administration of the therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease, or the predisposition toward disease.

As a result of the selection of substituents and substituent patterns, certain of the compounds of the present technology can have asymmetric centers and can occur as mixtures of stereoisomers, or as individual diastereomers, or enantiomers. All isomeric forms of these compounds, whether isolated or in mixtures, are within the scope of the present technology. Pharmaceutically acceptable salts include both the metallic (inorganic) salts and organic salts, a list of which is given in Remington's Pharmaceutical Sciences, 17th Edition, pg. 1418 (1985). It is well known to one skilled in the art that an appropriate salt form is chosen based on physical and chemical properties. As will be understood by those skilled in the art, pharmaceutically acceptable salts include, but are not limited to salts of inorganic acids such as hydrochloride, sulfate, phosphate, diphosphate, hydrobromide, and nitrate; or salts of an organic acid such as malate, maleate, fumarate, tartrate, succinate, citrate, acetate, lactate, methanesulfonate, p-toluenesulfonate or palmoate, salicylate, and stearate. Similarly pharmaceutically acceptable cations include, but are not limited to sodium, potassium, calcium, aluminum, lithium, and ammonium (especially ammonium salts with secondary amines). Also included within the scope of this technology are crystal forms, hydrates, and solvates.

Compositions according to the technology can be administered in the form of pharmaceutically acceptable salts. The term “pharmaceutically acceptable salt” refers to a salt that possesses the effectiveness of the parent compound and is not biologically or otherwise undesirable (e.g., is neither toxic nor otherwise deleterious to the recipient thereof). Suitable salts include acid addition salts that may, for example, be formed by mixing a solution of the compound of the present technology with a solution of a pharmaceutically acceptable acid such as hydrochloric acid, sulfuric acid, acetic acid, trifluoroacetic acid, or benzoic acid. Certain of the compounds employed in the present technology may carry an acidic moiety (e.g., COOH or a phenolic group), in which case suitable pharmaceutically acceptable salts thereof can include alkali metal salts (e.g., sodium or potassium salts), alkaline earth metal salts (e.g., calcium or magnesium salts), and salts formed with suitable organic ligands such as quaternary ammonium salts. Also, in the case of an acid (COOH) or alcohol group being present, pharmaceutically acceptable esters can be employed to modify the solubility or hydrolysis characteristics of the compound.

The term “administration” and variants thereof (e.g., “administering” a compound) in reference to a compound mean providing the compound or a prodrug of the compound to the individual in need of treatment or prophylaxis. When a compound of the technology or a prodrug thereof is provided in combination with one or more other active agents, “administration” and its variants are each understood to include provision of the compound or prodrug and other agents at the same time or at different times. When the agents of a combination are administered at the same time, they can be administered together in a single composition or they can be administered separately. As used herein, the term “composition” is intended to encompass a product comprising the specified ingredients in the specified amounts, as well as any product that results, directly or indirectly, from combining the specified ingredients in the specified amounts.

By “pharmaceutically acceptable” is meant that the ingredients of the pharmaceutical composition must be compatible with each other and not deleterious to the recipient thereof.

The term “subject” as used herein refers to an animal, preferably a mammal, most preferably a human, who has been the object of treatment, observation, or experiment.

The term “effective amount” as used herein means that amount of active compound or pharmaceutical agent that elicits the biological or medicinal response in a cell, tissue, organ, system, animal, or human that is being sought by a researcher, veterinarian, medical doctor, or other clinician. In some embodiments, the effective amount is a “therapeutically effective amount” for the alleviation of the symptoms of the disease or condition being treated. In some embodiments, the effective amount is a “prophylactically effective amount” for prophylaxis of the symptoms of the disease or condition being prevented. The term also includes herein the amount of active compound sufficient to inhibit the mineralocorticoid receptor and thereby elicit a response being sought (e.g., an “inhibition effective amount”). When the active compound is administered as the salt, references to the amount of active ingredient are to the free form (the non-salt form) of the compound. In some embodiments, this amount is between 1 mg and 1000 mg per day, e.g., between 1 mg and 500 mg per day (between 1 mg and 200 mg per day).

In the method of the present technology, compounds, optionally in the form of a salt, can be administered by any means that produces contact of the active agent with the agent's site of action. They can be administered by any conventional means available for use in conjunction with pharmaceuticals, either as individual therapeutic agents or in a combination of therapeutic agents. They can be administered alone, but typically are administered with a pharmaceutical carrier selected on the basis of the chosen route of administration and standard pharmaceutical practice. The compounds of the technology can, for example, be administered orally, parenterally (including subcutaneous injections, intravenous, intramuscular, intrasternal injection, or infusion techniques), by inhalation spray, or rectally, in the form of a unit dosage of a pharmaceutical composition containing an effective amount of the compound and conventional non-toxic pharmaceutically-acceptable carriers, adjuvants, and vehicles. Liquid preparations suitable for oral administration (e.g., suspensions, syrups, elixirs, and the like) can be prepared according to techniques known in the art and can employ any of the usual media such as water, glycols, oils, alcohols, and the like. Solid preparations suitable for oral administration (e.g., powders, pills, capsules, and tablets) can be prepared according to techniques known in the art and can employ such solid excipients as starches, sugars, kaolin, lubricants, binders, disintegrating agents, and the like. Parenteral compositions can be prepared according to techniques known in the art and typically employ sterile water as a carrier and optionally other ingredients, such as a solubility aid. Injectable solutions can be prepared according to methods known in the art wherein the carrier comprises a saline solution, a glucose solution, or a solution containing a mixture of saline and glucose. 
Further description of methods suitable for use in preparing pharmaceutical compositions for use in the present technology and of ingredients suitable for use in said compositions is provided in Remington's Pharmaceutical Sciences, 18th edition, edited by A. R. Gennaro, Mack Publishing Co., 1990. Compounds of the present technology can be made by a variety of methods depicted in the synthetic reaction schemes provided herein. The starting materials and reagents used in preparing these compounds generally are either available from commercial suppliers, such as Aldrich Chemical Co., or are prepared by methods known to those skilled in the art following procedures set forth in references such as Fieser and Fieser's Reagents for Organic Synthesis, Wiley & Sons: New York, Volumes 1-21; R. C. LaRock, Comprehensive Organic Transformations, 2nd edition Wiley-VCH, New York 1999; Comprehensive Organic Synthesis, B. Trost and I. Fleming (Eds.) vol. 1-9 Pergamon, Oxford, 1991; Comprehensive Heterocyclic Chemistry, A. R. Katritzky and C. W. Rees (Eds) Pergamon, Oxford 1984, vol. 1-9; Comprehensive Heterocyclic Chemistry II, A. R. Katritzky and C. W. Rees (Eds) Pergamon, Oxford 1996, vol. 1-11; and Organic Reactions, Wiley & Sons: New York, 1991, Volumes 1-40. The synthetic reaction schemes and examples provided herein are merely illustrative of some methods by which the compounds of the present technology can be synthesized, and various modifications to these synthetic reaction schemes can be made and will be suggested to one skilled in the art having referred to the disclosure contained in this application.

The starting materials and the intermediates of the synthetic reaction schemes can be isolated and purified if desired using conventional techniques, including but not limited to, filtration, distillation, crystallization, chromatography, and the like. Such materials can be characterized using conventional means, including physical constants and spectral data.

DESCRIPTION

The technology described herein relates to nucleotide analogs and related methods, compositions (e.g., reaction mixtures), kits, and systems for manipulating, detecting, isolating, and sequencing nucleic acids. In particular, some embodiments of the nucleotide analogs comprise an alkyne moiety that provides both terminating and linking functionalities. The technology provides advantages over conventional methods such as a lower cost and reduced complexity.

1. Nucleotide Analogs

Provided herein are analogs of nucleotides. In some embodiments, the nucleotide analogs comprise one or more alkyne terminator moieties. For example, in some embodiments the technology provides a 3′-O-blocked nucleotide analog that is a 3′-O-alkynyl nucleotide analog. In some embodiments, the 3′-O-blocked nucleotide analog is a 3′-O-propargyl nucleotide analog having a structure as shown below:

wherein B is the base of the nucleotide, e.g., adenine, guanine, cytosine, thymine, or uracil, e.g., B is one of:

or a natural or synthetic nucleobase, e.g., a modified purine such as hypoxanthine, xanthine, 7-methylguanine; a modified pyrimidine such as 5,6-dihydrouracil, 5-methylcytosine, 5-hydroxymethylcytosine; etc. and wherein P comprises a phosphate moiety (e.g., a monophosphate, a diphosphate, a triphosphate, a tetraphosphate); a 5′ hydroxyl; an alpha thiophosphate (e.g., phosphorothioate or phosphorodithioate), a beta thiophosphate (e.g., phosphorothioate or phosphorodithioate), and/or a gamma thiophosphate (e.g., phosphorothioate or phosphorodithioate); or an alpha methylphosphonate, a beta methylphosphonate, and/or a gamma methylphosphonate, as defined herein.

The nucleotide analogs are not limited to a specific phosphate group. In some embodiments, the phosphate group is a monophosphate group or a polyphosphate such as a diphosphate group, a triphosphate group, or a tetraphosphate group. In some embodiments, the phosphate group is a pyrophosphate. In some embodiments, P represents a group comprising a 5′ hydroxyl; an alpha thiophosphate (e.g., phosphorothioate or phosphorodithioate), a beta thiophosphate (e.g., phosphorothioate or phosphorodithioate), and/or a gamma thiophosphate (e.g., phosphorothioate or phosphorodithioate); or an alpha methylphosphonate, a beta methylphosphonate, and/or a gamma methylphosphonate.

Moreover, the base of the nucleotide analogs is not limited to a specific base. In some embodiments, the base is adenine, cytosine, guanine, thymine, uracil, or an analog thereof such as, for example, an acyclic base. The nucleotide analogs are not limited to a specific sugar moiety. In some embodiments, said sugar moiety is a ribose, deoxyribose, dideoxyribose, or an analog, derivative, and/or modification thereof (e.g., a thiofuranose, thioribose, thiodeoxyribose, etc.). In some embodiments, the sugar moiety is an arabinose or other related carbohydrate.

In some embodiments, the nucleotide analog is a 3′-O-propargyl-dNTP where N is selected from the group consisting of A, C, G, T and U. In some embodiments, the nucleotide analogs comprise detectable labels or tags such as an optically detectable moiety (e.g., a fluorescent dye), electrochemically detectable moieties (e.g., a redox active group), a quantum dot, a chromogen, a biological image contrast agent, a drug delivery vehicle tag, etc.

The synthesis of compounds provided herein is performed as described in, e.g., Bentley et al. (2008) “Accurate whole genome sequencing using reversible terminator chemistry” Nature 456(7218): 53-9 and Ju et al. (2006) “Four Color DNA Sequencing by synthesis using cleavable fluorescent nucleotide reversible terminators,” PNAS 103(52): 19635-40, incorporated herein by reference, with the modifications as needed to provide the various nucleotide analogs described herein. Additionally, various molecular characterizations such as NMR, mass spectrometry, and chromatography/affinity analysis are used in some embodiments to confirm successful synthesis of the correct compounds.

In some embodiments, synthetic methods for compounds encompassed and contemplated by the technology described herein comprise one or more of the following synthetic schemes or modifications thereof:

Synthesis of 3′-O-propargyl dCTP

Synthesis of 3′-O-propargyl dTTP

Synthesis of 3′-O-propargyl dATP

Synthesis of 3′-O-propargyl dGTP

In some embodiments, the nucleotide analogs are used to incorporate alkyne moieties into nucleic acid polymers, e.g., by a polymerase. In some embodiments, a polymerase is modified to enhance incorporation of the nucleotide analogs disclosed herein. Exemplary modified polymerases are disclosed in U.S. Pat. Nos. 4,889,818; 5,374,553; 5,420,029; 5,455,170; 5,466,591; 5,618,711; 5,624,833; 5,674,738; 5,789,224; 5,795,762; 5,939,292; and U.S. Patent Publication Nos. 2002/0012970 and 2004/0005599. Non-limiting examples of modified polymerases include G46E E678G CS5 DNA polymerase, E615G Taq DNA polymerase, ΔZO5R polymerase, and G46E L329A E678G CS5 DNA polymerase disclosed in U.S. Patent Publication No. 2005/0037398. The production of modified polymerases can be accomplished using many conventional techniques in molecular biology and recombinant DNA described herein and known in the art. In some embodiments, polymerase mutants, such as those described in U.S. Pat. No. 5,939,292, which incorporate NTPs as well as dNTPs are used.

In some embodiments the nucleotide analogs contain tags in addition to alkyne moieties (see supra). In some embodiments, nucleotide analogs with 3′ alkyne moieties are used to terminate a polymerase reaction. Chemical tags containing an azido moiety can then be appended to the nucleic acid polymer through click chemistry. In some embodiments, the reaction of the terminator alkyne compound with the azido moiety-containing compound forms a triazole compound. In some embodiments, the triazole compound functions as a nucleic acid backbone and further enzymatic reactions such as PCR are performed on the triazole compound.
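The sequence of events described above (termination of a polymerase reaction by a 3′ alkyne nucleotide analog, followed by attachment of an azido-containing tag to form a triazole) can be sketched as a toy model. The Python example below is a bookkeeping illustration under stated assumptions and is not a chemical or enzymatic simulation; the names `Strand`, `incorporate_terminator`, and `click_tag`, and the example tag label, are hypothetical and do not come from the specification.

```python
# Toy model only: tracks the 3' end state of a strand through termination
# and click conjugation. All names here are assumptions for illustration.
from dataclasses import dataclass
from typing import Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


@dataclass(frozen=True)
class Strand:
    sequence: str                # written 5' to 3'
    three_prime: str = "OH"      # "OH", "O-propargyl", or "triazole"
    tag: Optional[str] = None    # label of the conjugated azido tag, if any


def incorporate_terminator(primer: Strand, template_3to5: str) -> Strand:
    """Add one 3'-O-propargyl nucleotide opposite the next template base.

    The template is written 3' to 5' so that position i of the template
    pairs with position i of the primer.
    """
    if primer.three_prime != "OH":
        raise ValueError("strand has no free 3'-OH and cannot be extended")
    position = len(primer.sequence)
    if position >= len(template_3to5):
        raise ValueError("no template base left to copy")
    next_base = COMPLEMENT[template_3to5[position]]
    # The 3'-O-propargyl group blocks any further extension.
    return Strand(primer.sequence + next_base, three_prime="O-propargyl")


def click_tag(strand: Strand, azide_tag: str) -> Strand:
    """Conjugate an azido tag to the terminal alkyne, giving a triazole."""
    if strand.three_prime != "O-propargyl":
        raise ValueError("strand has no terminal alkyne to react with an azide")
    return Strand(strand.sequence, three_prime="triazole", tag=azide_tag)


terminated = incorporate_terminator(Strand("ATG"), "TACGGAC")
print(terminated.sequence, terminated.three_prime)  # ATGC O-propargyl
tagged = click_tag(terminated, "azido-dye")
print(tagged.three_prime, tagged.tag)               # triazole azido-dye
```

In this sketch a second call to `incorporate_terminator` on the terminated strand raises an error, which mirrors the terminating function of the alkyne moiety, while `click_tag` mirrors its linking function.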

2. Oligonucleotides

In some embodiments, the nucleotide analogs find use for the synthesis of triazole-backbone-modified nucleic acids (e.g., oligonucleotide analogs). For example, the nucleotide analogs find use in methods for aqueous, solid-phase oligonucleotide synthesis. Such methods thus obviate the need for, inter alia, use of organic solvents, deprotection steps, and capping steps in some conventional syntheses; in addition, aqueous methods minimize or eliminate the undesired oxidation of phosphorus in the synthesized compounds, e.g., during cycle synthesis. It is contemplated that an advantage of aqueous-phase synthesis is that it is more rapid than conventional organic-phase synthesis techniques.

In some embodiments is provided a triazole-backbone-modified oligonucleotide comprising nucleotide analogs provided herein. That is, the nucleotide analogs described herein find use in the synthesis of modified oligonucleotides comprising one or more nucleotide analogs and comprising triazole groups in the molecular backbone. In some embodiments, oligonucleotides comprise some conventional nucleotides and some nucleotide analogs in various proportions. In some embodiments, oligonucleotides comprise only nucleotide analogs and do not comprise conventional nucleotides.

Accordingly, in some embodiments is provided a nucleotide analog as described elsewhere herein, e.g., having a structure according to:

Such nucleotide analogs and variants and modified derivatives thereof (e.g., comprising a base analog or alternative sugar as described herein) provide a directional, bi-functional nucleotide analog (e.g., a directional, bi-functional polymerization agent), e.g., for the synthesis of an oligonucleotide (e.g., an oligonucleotide analog, e.g., an oligonucleotide comprising a nucleotide analog described herein). In some embodiments, the directional, bi-functional nucleotide analog provides for synthesis of an oligonucleotide in a 5′ to 3′ direction and in some embodiments the directional, bi-functional nucleotide analog provides for synthesis of an oligonucleotide in a 3′ to 5′ direction. In some embodiments, the synthesis of the oligonucleotide comprises use of a propargyl moiety and a linker attached to a solid support (e.g., a linker (e.g., a carboxylate linker) that is cleavable under acidic (e.g., mildly acidic) conditions). In some embodiments, the synthesis of the oligonucleotide comprises use of a propargyl moiety and an azide linker attached to a solid support. In some embodiments, a 3′-thio-modified propargyl moiety is linked to the solid support and is cleaved with a reagent comprising silver nitrate or mercuric chloride. In some embodiments, the solid support comprises a controlled pore glass, silica, sephadex, agarose, acrylamide, latex, or polystyrene, etc., provided, in some embodiments, as microspheres.

Representative synthetic schemes for producing oligonucleotides are provided as follows:

a. 3′ to 5′ oligonucleotide synthesis using 3′-alkynyl/5′-azido nucleotide analog

In exemplary synthetic scheme a, X is a solid support, the wavy line (˜) is a cleavable linker, B₁ is a first nucleotide base, and B₂ is a second nucleotide base that may be the same as or different from B₁. The first step (1) links the first nucleotide analog to the solid support (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst). Then, one or more (e.g., multiple) rounds of the second step (2) (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst) result in synthesis of the oligonucleotide analog, with each round adding another nucleotide analog to the growing polymer chain.

b. 5′ to 3′ oligonucleotide synthesis using 3′-alkynyl/5′-azido nucleotide analog

In exemplary synthetic scheme b, X is a solid support, the wavy line (˜) is a cleavable linker (e.g., a carboxylate linker), B₁ is a first nucleotide base, and B₂ is a second nucleotide base that may be the same as or different from B₁. After reacting the first nucleotide analog with the solid support comprising a linker and reactive carboxylate moiety (e.g., to form an ester link), one or more (e.g., multiple) rounds of nucleotide addition and reaction (1) (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst) result in synthesis of the oligonucleotide analog, with each round adding another nucleotide analog to the growing polymer chain.

c. 5′ to 3′ oligonucleotide synthesis using 3′-azido/5′-alkynyl nucleotide analog

In exemplary synthetic scheme c, X is a solid support, the wavy line (˜) is a cleavable linker, B₁ is a first nucleotide base, and B₂ is a second nucleotide base that may be the same as or different from B₁. The first step (1) links the first nucleotide analog to the solid support (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst). Then, one or more (e.g., multiple) rounds of the second step (2) (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst) result in synthesis of the oligonucleotide analog, with each round adding another nucleotide analog to the growing polymer chain.

d. 3′ to 5′ oligonucleotide synthesis using 3′-azido/5′-alkynyl nucleotide analog

In exemplary synthetic scheme d, X is a solid support, the wavy line (˜) is a cleavable linker, B₁ is a first nucleotide base, and B₂ is a second nucleotide base that may be the same as or different from B₁. The first step (1) links the first nucleotide analog to the solid support (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst). Then, one or more (e.g., multiple) rounds of the second step (2) (e.g., using a click chemistry reaction, e.g., using a copper-based catalyst) result in synthesis of the oligonucleotide analog, with each round adding another nucleotide analog to the growing polymer chain.

In some embodiments, the oligonucleotide and/or nucleotide analog is reacted with a linker to attach the oligonucleotide and/or nucleotide analog to a solid support, e.g., a bead, a planar surface (an array), a column, etc. The term “solid support” as used herein refers to a material or group of materials having a rigid or semi-rigid surface or surfaces. In many embodiments, at least one surface of the solid support is substantially flat, although in some embodiments it may be desirable to separate regions of the solid support with, for example, wells, raised regions, pins, etched trenches, or the like. According to other embodiments, the solid support takes the form of beads, resins, gels, microspheres, or other geometric configurations. See, e.g., U.S. Pat. No. 5,744,305 and U.S. Pat. Pub. Nos. 20090149340 and 20080038559 for exemplary substrates. In some embodiments, the linker is a cleavable linker (e.g., cleavable by light, heat, chemical, or biochemical reaction).

In exemplary synthesis schemes a, b, c, and d, embodiments of methods for synthesizing an oligonucleotide comprise one or more additional steps of adding a nucleotide analog, reacting a nucleotide analog, washing away and/or otherwise removing an unincorporated nucleotide analog (e.g., after a synthesis step), cleaving a linker, isolating a synthesized oligonucleotide, purifying a synthesized oligonucleotide, and/or adding a tag or a label to the synthesized oligonucleotide.
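By way of illustration only, the bookkeeping behind a chosen synthesis direction can be sketched in Python. The sketch below is not part of the disclosed chemistry: it only tracks the order in which monomers would be coupled to the support and assumes one triazole linkage is formed between each pair of adjacent monomers. The function names (`coupling_order`, `synthesize`) and the direction labels (`"5to3"`, `"3to5"`) are hypothetical.

```python
# Toy bookkeeping model of stepwise solid-phase synthesis (illustrative only).
# It tracks the order in which monomers are coupled, not any real chemistry.


def coupling_order(sequence_5_to_3: str, direction: str) -> list:
    """Return bases in the order they are coupled to the support.

    direction is "5to3" or "3to5" (hypothetical labels for this sketch).
    """
    if direction == "5to3":
        return list(sequence_5_to_3)
    if direction == "3to5":
        # The support-bound first monomer becomes the 3' end of the product.
        return list(reversed(sequence_5_to_3))
    raise ValueError("direction must be '5to3' or '3to5'")


def synthesize(sequence_5_to_3: str, direction: str) -> dict:
    """Simulate attachment, repeated coupling rounds, and release."""
    order = coupling_order(sequence_5_to_3, direction)
    chain = []
    for base in order:
        chain.append(base)  # one coupling round per monomer
    # Report the released product in the conventional 5'->3' orientation.
    product = "".join(chain if direction == "5to3" else reversed(chain))
    return {
        "coupling_order": order,
        # Assumption: one triazole between each pair of adjacent monomers.
        "internucleotide_triazoles": max(len(chain) - 1, 0),
        "product_5_to_3": product,
    }


if __name__ == "__main__":
    print(synthesize("ACGT", "3to5"))
```

In this sketch the same target sequence yields opposite coupling orders for the two directions, which is the practical consequence of choosing a 3′ to 5′ scheme over a 5′ to 3′ scheme.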

3. Tagging and Labeling

Nucleic acid detection methodologies serve a critical role in the field of molecular diagnostics. The ability to manipulate biomolecules specifically and efficiently has been the core driving force behind many successful nucleic acid detection technologies. Among the many molecular biology techniques, the ability to label or “tag” a biomolecule of interest has been a key technology for subsequent detection and identification of the biomolecule.

Accordingly, in some embodiments the technology provides compositions, methods, systems, and kits related to tagging of biomolecules such as nucleic acids and/or nucleotides. In some embodiments, alkyne-containing nucleotides such as 3′-O-propargyl-modified nucleotides (e.g., 3′-O-propargyl dNTPs) are incorporated into a nucleic acid in a polymerase extension reaction. In some embodiments, the nucleotide analog halts the polymerase reaction. In some embodiments, the alkyne-containing nucleotide is used (e.g., without further processing and/or purification) in a tagging reaction with an azide-modified tag or labeling reagent using chemical ligation (e.g., a click chemistry reaction). The covalent linkage created using this chemistry mimics natural nucleic acid phosphodiester bonds, thus providing a conjugated product that is suitable for use in subsequent enzymatic reactions such as a polymerase chain reaction.

Labels and tags are compounds, structures, or elements that are amenable to at least one method of detection and/or isolation that allows for discrimination between different labels and/or tags. For example, labels and/or tags comprise semiconductor nanocrystals, metal compounds, peptides, antibodies, small molecules, isotopes, particles, or structures having different shapes, colors, barcodes, or diffraction patterns associated therewith or embedded therein, strings of numbers, random fragments of proteins or nucleic acids, or different isotopes.

The terms “label” and “tag” are used interchangeably herein to refer to any chemical moiety attached to a nucleotide or nucleic acid, wherein the attachment may be covalent or non-covalent. Preferably, the label is detectable and renders the nucleotide or nucleic acid detectable to the practitioner of the technology. Exemplary detectable labels that find use with the technology provided herein include, for example, a fluorescent label, a chemiluminescent label, a quencher, a radioactive label, biotin, and gold, or combinations thereof. Detectable labels include luminescent molecules, fluorochromes, fluorescent quenching agents, colored molecules (e.g., chromogens used for in situ hybridization (ISH, FISH) and bright field imaging applications), radioisotopes, or scintillants. Detectable labels also include any useful linker molecule (such as biotin, avidin, digoxigenin, streptavidin, HRP, protein A, protein G, antibodies or fragments thereof, Grb2, polyhistidine, Ni²⁺, FLAG tags, myc tags), heavy metals, enzymes (examples include alkaline phosphatase, peroxidase, and luciferase), electron donors/acceptors, acridinium esters, dyes, and colorimetric substrates. It is also envisioned that a change in mass may be considered a detectable label, e.g., as finds use in surface plasmon resonance detection.

The technology also finds use in applications such as linking DNA-containing alkynes to an image contrast agent (e.g., meglumines, ferumoxsil, ferumoxides, gadodiamide, gadoversetamide, gallium compounds, indium compounds, thallium compounds, rubidium compounds, technetium compounds, iopamidol, etc.), e.g., for biomedical imaging (e.g., magnetic resonance imaging (MRI), computed tomography (CT) scanning, X-ray, etc.), coupling DNA to oligo and/or antisense drug-delivery vehicle tags (e.g., steroids, lipids, cholesterol, vitamins, hormones, carbohydrates, and/or receptor-specific ligands (e.g., folate, nicotinamide, acetylcholine, GABA, glutamate, serotonin, etc.)), and coupling to chromogen moieties for in situ hybridization applications. The skilled artisan would readily recognize useful detectable labels that are not mentioned above, which may be employed in the operation of the present invention.

As such, the technology is not limited in the label or tag that is linked to the nucleic acid, e.g., by use of an azide labeling reagent in a click chemistry reaction. Thus, in some embodiments, the label comprises a fluorescently detectable moiety that is based on a dye, wherein the dye is a xanthene, fluorescein, rhodamine, BODIPY, cyanine, coumarin, pyrene, phthalocyanine, phycobiliprotein, ALEXA FLUOR® 350, ALEXA FLUOR® 405, ALEXA FLUOR® 430, ALEXA FLUOR® 488, ALEXA FLUOR® 514, ALEXA FLUOR® 532, ALEXA FLUOR® 546, ALEXA FLUOR® 555, ALEXA FLUOR® 568, ALEXA FLUOR® 594, ALEXA FLUOR® 610, ALEXA FLUOR® 633, ALEXA FLUOR® 647, ALEXA FLUOR® 660, ALEXA FLUOR® 680, ALEXA FLUOR® 700, ALEXA FLUOR® 750, a fluorescent semiconductor crystal, or a squaraine dye. In some embodiments, the tag or label comprises a radioisotope, a spin label, a quantum dot, or a bioluminescent moiety. In some embodiments, the label is a fluorescently detectable moiety as described in, e.g., Haugland (September 2005) MOLECULAR PROBES HANDBOOK OF FLUORESCENT PROBES AND RESEARCH CHEMICALS (10th ed.), which is herein incorporated by reference in its entirety.

In some embodiments the label (e.g., a fluorescently detectable label) is one available from ATTO-TEC GmbH (Am Eichenhang 50, 57076 Siegen, Germany), e.g., as described in U.S. Pat. Appl. Pub. Nos. 20110223677, 20110190486, 20110172420, 20060179585, and 20030003486; and in U.S. Pat. No. 7,935,822, all of which are incorporated herein by reference.

In some embodiments, the nucleic acid and/or nucleotide comprising a modified nucleotide, e.g., comprising an alkyne group, is tagged with a moiety that provides for detection and/or isolation of the nucleic acid and/or nucleotide by specific interaction with a second moiety. For example, in some embodiments, the nucleic acid and/or nucleotide is linked (e.g., by a click chemistry reaction) to a tag comprising an azide and a biotin moiety, an epitope, an antigen, an aptamer, an affinity tag, a histidine tag, a barcode oligonucleotide, a poly-A tail, a capture oligonucleotide, a protein, a sugar, a chelator, a mass tag (e.g., a 2-nitro-methyl-benzyl group, a 2-nitro-methyl-3-fluorobenzyl group, a 2-nitro-α-methyl-3,4-difluorobenzyl group, a 2-nitro-α-methyl-3,4-dimethoxybenzyl group, a 2-nitro-α-methyl-benzyl group, a 2-nitro-α-methyl-3-fluorobenzyl group, a 2-nitro-methyl-3,4-difluorobenzyl group, a 2-nitro-α-methyl-3,4-dimethoxybenzyl group), or a charge tag.

In some embodiments, the nucleic acid and/or nucleotide comprising an alkyne is reacted with a linker comprising an azide to attach the nucleic acid and/or nucleotide to a solid support, e.g., a bead, a planar surface (an array), a column, etc. The term “solid support” as used herein refers to a material or group of materials having a rigid or semi-rigid surface or surfaces. In many embodiments, at least one surface of the solid support is substantially flat, although in some embodiments it may be desirable to separate regions of the solid support with, for example, wells, raised regions, pins, etched trenches, or the like. According to other embodiments, the solid support takes the form of beads, resins, gels, microspheres, or other geometric configurations. See, e.g., U.S. Pat. No. 5,744,305 and U.S. Pat. Pub. Nos. 20090149340 and 20080038559 for exemplary substrates.

In some embodiments, the linker is a cleavable linker (e.g., cleavable by light, heat, chemical, or biochemical reaction).

4. Reactions

In some embodiments, the technology finds use in linking an oligonucleotide to a nucleic acid (e.g., a DNA, an RNA). For example, in some embodiments, a nucleic acid comprising a nucleotide analog (e.g., a nucleic acid comprising an alkyne group, e.g., a 3′-O-propargyl nucleotide, e.g., a 3′-O-propargyl dNTP) is linked to an oligonucleotide comprising a group (e.g., an azide group) that is chemically reactive with the chemical moiety of the nucleotide analog, e.g., by a click chemistry reaction. In some embodiments, the oligonucleotide is single-stranded and in some embodiments the oligonucleotide is double-stranded. In some embodiments the nucleic acid is a DNA and in some embodiments the nucleic acid is an RNA; in some embodiments the oligonucleotide is a DNA and in some embodiments the oligonucleotide is an RNA.

In some embodiments, methods of the technology involve attaching an adaptor to a nucleic acid. In some embodiments an adaptor comprises a functional moiety for chemical ligation to a nucleotide analog. For example, in some embodiments an adaptor comprises an azide group (e.g., at the 5′ end) that is reactive with an alkynyl group (e.g., a propargyl group, e.g., at the 3′ end of a nucleic acid comprising the nucleotide analog), e.g., by a click chemistry reaction (e.g., using a copper (e.g., a copper-based) catalyst reagent).

In some embodiments the alkyne is a butargyl group or a structural derivative thereof. In some embodiments the alkyne comprises a sulfur atom, e.g., to provide a thio-alkynyl, a thio-propargyl (e.g. 3′-S-propargyl) group, or a structural derivative thereof.

In some embodiments, the adaptors comprise a universal sequence and/or an index, e.g., a barcode nucleotide sequence. Additionally, adaptors can contain one or more of a variety of sequence elements, including but not limited to, one or more amplification primer annealing sequences or complements thereof, one or more sequencing primer annealing sequences or complements thereof, one or more barcode sequences, one or more common sequences shared among multiple different adaptors or subsets of different adaptors (e.g., a universal sequence), one or more restriction enzyme recognition sites, one or more overhangs complementary to one or more target polynucleotide overhangs, one or more probe binding sites (e.g., for attachment to a sequencing platform, such as a flow cell for massively parallel sequencing, such as that developed by Illumina, Inc.), one or more random or near-random sequences (e.g., one or more nucleotides selected at random from a set of two or more different nucleotides at one or more positions, with each of the different nucleotides selected at one or more positions represented in a pool of adaptors comprising the random sequence), and combinations thereof. Two or more sequence elements can be non-adjacent to one another (e.g., separated by one or more nucleotides), adjacent to one another, partially overlapping, or completely overlapping. For example, an amplification primer annealing sequence can also serve as a sequencing primer annealing sequence. Sequence elements can be located at or near the 3′ end, at or near the 5′ end, or in the interior of the adaptor oligonucleotide. When an adaptor oligonucleotide is capable of forming a secondary structure, such as a hairpin, sequence elements can be located partially or completely outside the secondary structure, partially or completely inside the secondary structure, or in between sequences participating in the secondary structure.
For example, when an adaptor oligonucleotide comprises a hairpin structure, sequence elements can be located partially or completely inside or outside the hybridizable sequences (the “stem”), including in the sequence between the hybridizable sequences (the “loop”). In some embodiments, the adaptor oligonucleotides in a plurality of adaptor oligonucleotides having different barcode sequences comprise a sequence element common among all adaptor oligonucleotides in the plurality. A difference in sequence elements can be any such that at least a portion of different adaptors do not completely align, for example, due to changes in sequence length, deletion or insertion of one or more nucleotides, or a change in the nucleotide composition at one or more nucleotide positions (such as a base change or base modification). In some embodiments, an adaptor oligonucleotide comprises a 5′ overhang, a 3′ overhang, or both that is complementary to one or more target polynucleotides. Complementary overhangs can be one or more nucleotides in length, including but not limited to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more nucleotides in length. Complementary overhangs may comprise a fixed sequence. Complementary overhangs may comprise a random sequence of one or more nucleotides, such that one or more nucleotides are selected at random from a set of two or more different nucleotides at one or more positions, with each of the different nucleotides selected at one or more positions represented in a pool of adaptors with complementary overhangs comprising the random sequence. In some embodiments, an adaptor overhang is complementary to a target polynucleotide overhang produced by restriction endonuclease digestion. In some embodiments, an adaptor overhang consists of an adenine or a thymine.

In some embodiments, the adaptor sequences contain a molecular binding site identification element to facilitate identification and isolation of the target nucleic acid for downstream applications. Molecular binding as an affinity mechanism allows for the interaction between two molecules to result in a stable association complex. Molecules that can participate in molecular binding reactions include proteins, nucleic acids, carbohydrates, lipids, and small organic molecules such as ligands, peptides, or drugs.

When a nucleic acid molecular binding site is used as part of the adaptor, it can be used to employ selective hybridization to isolate a target sequence. Selective hybridization may restrict substantial hybridization to target nucleic acids containing the adaptor with the molecular binding site and capture nucleic acids that are sufficiently complementary to the molecular binding site. Thus, through “selective hybridization” one can detect the presence of the target polynucleotide in an impure sample containing a pool of many nucleic acids. An example of a nucleotide-nucleotide selective hybridization isolation system comprises a system with several capture nucleotides, which are complementary sequences to the molecular binding identification elements, and are optionally immobilized to a solid support. In other embodiments, the capture polynucleotides are complementary to the target sequence itself or a barcode or unique tag contained within the adaptor. The capture polynucleotides can be immobilized to various solid supports, such as inside of a well of a plate, mono-dispersed spheres, microarrays, or any other suitable support surface known in the art. The hybridized complementary adaptor polynucleotides attached on the solid support can be isolated by washing away the undesirable non-binding nucleic acids, leaving the desirable target polynucleotides behind. If complementary adaptor molecules are fixed to paramagnetic spheres or similar bead technology for isolation, the spheres can then be mixed in a tube together with the target polynucleotide containing the adaptors. When the adaptor sequences have been hybridized with the complementary sequences fixed to the spheres, undesirable molecules can be washed away while spheres are kept in the tube with a magnet or similar agent. The desired target molecules can be subsequently released by increasing the temperature, changing the pH, or by using any other suitable elution method known in the art.
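By way of illustration only, the capture-and-wash logic of selective hybridization can be sketched in silico. The following Python sketch is a simplification and not a description of any disclosed method: it treats hybridization as exact reverse-complement matching between a capture oligonucleotide and a binding site carried by the target, and it ignores temperature, salt, and mismatch tolerance. The sequences and the function names (`reverse_complement`, `capture`) are hypothetical.

```python
# Toy model of selective hybridization: a capture probe "binds" a target
# only if the target contains the probe's exact reverse complement.
# All sequences and names are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def capture(pool, probe):
    """Split a pool of sequences into (captured, washed_away) for one probe."""
    site = reverse_complement(probe)
    captured = [s for s in pool if site in s]
    washed_away = [s for s in pool if site not in s]
    return captured, washed_away


if __name__ == "__main__":
    binding_site = "GATTACAGG"  # hypothetical adaptor binding element
    probe = reverse_complement(binding_site)
    pool = ["ACGT" + binding_site + "TTGCA", "ACGTACGTACGT"]
    kept, washed = capture(pool, probe)
    print(kept)    # targets carrying the adaptor binding site
    print(washed)  # sequences removed by the wash
```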

As used herein, a “barcode” or “barcode oligonucleotide” is a known nucleic acid sequence that allows some feature of a nucleic acid with which the barcode is associated to be identified. In some embodiments, the feature of the nucleic acid to be identified is the sample or source from which the nucleic acid is derived. The barcode sequence generally includes certain features that make the sequence useful, e.g., in sequencing reactions. For example, the barcode sequences are designed to have minimal or no homopolymer regions, e.g., 2 or more of the same base in a row such as AA or CCC, within the barcode sequence. In some embodiments, the barcode sequences are also designed so that they are at least one edit distance away from the base addition order when performing a manipulation or molecular biological process, such as base-by-base sequencing, ensuring that the first and last bases do not match the expected bases of the sequence.
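By way of illustration only, the homopolymer criterion described above can be expressed as a simple filter. The Python sketch below enumerates candidate barcodes of a given length and discards any that contain a run of two or more identical bases; it does not address the base-addition-order criterion. The function names (`has_homopolymer`, `candidate_barcodes`) and the `min_run` parameter are hypothetical.

```python
# Illustrative barcode filter: reject sequences containing a homopolymer run.
# Names and the default threshold (a run of 2) are assumptions for this sketch.
from itertools import product


def has_homopolymer(seq: str, min_run: int = 2) -> bool:
    """Return True if seq contains min_run or more identical bases in a row."""
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run >= min_run:
            return True
    return False


def candidate_barcodes(length: int):
    """Yield every A/C/G/T sequence of the given length with no homopolymer."""
    for bases in product("ACGT", repeat=length):
        seq = "".join(bases)
        if not has_homopolymer(seq):
            yield seq


if __name__ == "__main__":
    barcodes = list(candidate_barcodes(4))
    print(len(barcodes), barcodes[:5])
```

Exhaustive enumeration is practical only for short barcodes; for longer barcodes a sampling strategy would likely be preferable.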

In some embodiments, the barcode sequences are designed such that each sequence is correlated to a particular nucleic acid. Methods of designing sets of barcode sequences are shown, for example, in U.S. Pat. No. 6,235,475, the contents of which are incorporated by reference herein in their entirety. In some embodiments, barcode sequences range from about 5 nucleotides to about 15 nucleotides. In a particular embodiment, the barcode sequences range from about 4 nucleotides to about 7 nucleotides. In some embodiments, lengths and sequences of barcode sequences are designed to achieve a desired level of accuracy of determining the identity of a nucleic acid. For example, in some embodiments barcode sequences are designed such that after a tolerable number of point mutations, the identity of the associated nucleic acid is deduced with a desired accuracy. In some embodiments, a Tn-5 transposase (commercially available from Epicentre Biotechnologies; Madison, Wis.) cuts a nucleic acid into fragments and inserts short pieces of DNA into the cuts. The short pieces of DNA are used to incorporate the barcode sequences.
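By way of illustration only, the idea of deducing the identity of a nucleic acid after a tolerable number of point mutations can be sketched as a nearest-barcode lookup. The Python sketch below assigns an observed barcode read to a sample when exactly one known barcode lies within a chosen number of mismatches (Hamming distance) and otherwise leaves it unassigned. The sample names, barcode sequences, function names (`hamming`, `assign_barcode`), and the `max_mismatches` parameter are hypothetical; insertions and deletions are not handled.

```python
# Illustrative barcode assignment tolerant of point mutations (substitutions).
# Barcode sequences, sample names, and thresholds are assumptions.


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_barcode(read: str, barcodes: dict, max_mismatches: int = 1):
    """Return the sample whose barcode uniquely matches read, else None."""
    hits = [
        name
        for name, seq in barcodes.items()
        if hamming(read, seq) <= max_mismatches
    ]
    # Ambiguous or unmatched reads are left unassigned.
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    known = {"sample_1": "ACGTAC", "sample_2": "TGCATG"}
    print(assign_barcode("ACGTAG", known))  # one substitution from sample_1
    print(assign_barcode("TTTTTT", known))  # too far from both barcodes
```

Under these assumptions, a barcode set whose members differ pairwise at three or more positions allows any single substitution to be corrected unambiguously.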

Methods for designing sets of barcode sequences and other methods for attaching adaptors (e.g., comprising barcode sequences) are shown in U.S. Pat. Nos. 6,138,077; 6,352,828; 5,636,400; 6,172,214; 6,235,475; 7,393,665; 7,544,473; 5,846,719; 5,695,934; 5,604,097; 6,150,516; RE39,793; 7,537,897; 6,172,218; and 5,863,722, the content of each of which is incorporated by reference herein in its entirety.

With appropriate changes to reaction schemes, 5′-alkynyl/3′-azido and 5′-azido/3′-alkynyl nucleotide analogs are contemplated to be interchangeable in reactions with the appropriate reactive substrates for linking to the 5′ and/or 3′ ends of nucleotide analogs, e.g., by click chemistry.

In some embodiments, the technology finds use in a primer extension reaction (see, e.g., FIG. 3) and/or adaptor ligation (see, e.g., FIG. 3). In particular embodiments, a primer annealed to a template (e.g., a target nucleic acid) is extended by a polymerase, which adds a nucleotide analog to the primer. While FIG. 3 shows the exemplary addition of a G-containing nucleotide analog across from the C base in the template, the primer extension technology is not limited in the bases that are added. Then, in some embodiments, an azide-modified DNA (e.g., an adaptor, e.g., an adaptor comprising a primer binding site and/or a barcode) is ligated to the primer extension product (e.g., by click chemistry). The ligation product comprises a linkage that mimics the conventional nucleic acid backbone, e.g., a triazole, and that is biocompatible with downstream enzymatic and/or chemical reactions, e.g., PCR (e.g., see FIG. 3).
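By way of illustration only, the order of operations in primer extension followed by chemical ligation can be sketched with strings. The Python sketch below is a toy model: it finds where a primer anneals to a template by exact complementarity, reports the single complementary base that a polymerase would add (after which a 3′-terminating analog permits no further extension), and then joins an adaptor to the extended primer with a placeholder marking the triazole linkage. The sequences, the `[triazole]` marker, and the function names (`extend_once`, `click_ligate`) are hypothetical.

```python
# Toy model of single-base primer extension followed by click ligation.
# Sequences are written 5'->3'; all names and values are illustrative.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def extend_once(template: str, primer: str) -> str:
    """Return the base added to the primer's 3' end by one extension step."""
    site = reverse_complement(primer)
    start = template.find(site)
    if start == -1:
        raise ValueError("primer does not anneal to template")
    if start == 0:
        raise ValueError("no template base left to copy")
    # The next template base to be copied lies 5' of the annealing site.
    return COMPLEMENT[template[start - 1]]


def click_ligate(extended_primer: str, azide_adaptor: str) -> str:
    """Join the 3'-alkyne product to a 5'-azide adaptor (placeholder linkage)."""
    return extended_primer + "[triazole]" + azide_adaptor


if __name__ == "__main__":
    template = "TTCGCTA"  # hypothetical template
    primer = "TAGC"       # anneals to the 3' portion of the template
    added = extend_once(template, primer)
    print(added)  # base incorporated opposite the next template base
    print(click_ligate(primer + added, "ACGT"))
```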

5. Sequencing

In some embodiments, the nucleotide analogs find use in nucleic acid sequencing, e.g., “next generation sequencing” (NGS). For example, DNA sequencing-by-synthesis (SBS) involves determining DNA sequence by detecting certain signals (e.g., pyrophosphate groups) that are generated when a nucleotide is incorporated by a polymerase reaction. Other SBS methods involve alternate means of detecting the polymerase addition of nucleotides such as detection of light emission, change in fluorescence, change in pH, or some other physical or chemical change. For example, Illumina's reversible terminator sequencing relies upon dye-containing reversible terminator bases. When one such base is added to the growing nucleic acid polymer, the reaction is halted and the dye on the terminal nucleic acid is detectable. The terminator-containing molecule can then be treated with a cleavage enzyme that reverses the termination and allows for the addition of additional nucleotides. This step-wise process is an improvement on earlier technology, but the extra cleavage step and subsequent sample purification leave room for further improvement.

In some embodiments, the present invention provides functional terminator nucleotides containing 3′-alkynes that are incorporated into a growing nucleic acid polymer and terminate the extension reaction. The 3′-alkyne can be immediately used in a reaction with an azide-modified tag through click chemistry. The linkage created through click chemistry mimics a natural nucleic acid phosphodiester bond and provides for the use of the conjugated product in subsequent enzymatic reactions such as PCR. In this way, some embodiments of the present invention eschew the terminator cleavage step of the reversible terminator sequencing reaction and thereby decrease the run time of the reaction (see, e.g., the embodiment depicted in FIG. 4).

In some embodiments, a nucleotide analog, e.g., a 3′-alkynyl nucleotide analog (e.g., a 3′-O-propargyl nucleotide analog such as a 3′-O-propargyl dNTP) is used in a polymerase reaction and nucleic acid extension products are made in which the 3′ end comprises an alkyne group. The alkyne-modified nucleic acid products can subsequently be used as a specific substrate in chemical ligation reactions with compounds containing azido moieties through click chemistry (e.g., a copper(I)-catalyzed 1,3-dipolar cyclo-addition reaction). This type of click chemistry conjugates alkynes and azides, forming a covalent linkage (e.g., a five-membered triazole ring) between the alkyne-containing compound and the azide-containing compound. For example, a 5′-azide-modified DNA fragment can be chemically ligated to a 3′-alkyne-modified DNA fragment using click chemistry. This conjugated DNA product can then be used as input in subsequent enzymatic reactions such as PCR or sequencing because the covalent linkage created by the five-membered triazole ring mimics the natural phosphodiester bond of the DNA backbone and does not significantly and/or detectably inhibit subsequent enzymatic activities.

The contemplated reactions involving the nucleotide analogs provide multiple potential detection events. In some embodiments, the nucleotide analog incorporates a specific fluorophore into the elongating nucleic acid strand. In some embodiments, the addition of the nucleotide analog creates a detectable signal such as pyrophosphate. In some embodiments, the incorporation of the nucleotide analog can be detected by emission of light, change in fluorescence, change in pH, change in conformation, or some other chemical change. In some embodiments the click chemistry reaction between the incorporated nucleotide analog and a compound comprising an azido moiety can be detected in ways similar to the incorporation of the nucleotide analog.

Because of the click chemistry, the alkyne-containing nucleotide analogs readily react with compounds containing azido moieties. Using this click chemistry, various tags can be inserted covalently into an elongating nucleic acid strand that contains one of the nucleotide analogs. Examples of such tags include but are not limited to fluorescent dyes, DNA, RNA, oligonucleotides, nucleosides, proteins, amino acids, polypeptides, polysaccharides, nucleic acid, synthetic polymers, and viruses.

The technology relates in some embodiments to methods for sequencing a nucleic acid. In some embodiments, sequencing is performed by the following sequence of events with the exemplary use of a nucleotide analog comprising a 3′-O-propargyl moiety. First, the nucleotide analog is oriented in the polymerase active site (e.g., by a polymerase) to be base-paired to a complementary base of the template strand and to be adjacent to the free 3′ hydroxyl of the growing synthesized strand. Next, the nucleotide analog is added to the 3′ end of a growing strand by the polymerase, e.g., by the enzyme-catalyzed attack of the 3′ hydroxyl on the alpha-phosphate of the nucleotide analog. Further extension of the strand by the polymerase is blocked by the 3′-O-propargyl terminating group on the incorporated nucleotide analog. In some embodiments, the strand is then subjected to a PCR reaction and used in various sequencing methods.

In some embodiments, the 3′-O-propargyl terminating moiety is treated with an azide-tagged DNA molecule. This removes the terminator alkyne. Once the terminator has been removed the growing strand is free for further polymerization: the next base is incorporated to continue another cycle, e.g., a nucleotide analog is oriented in the polymerase active site, the nucleotide analog is added to the 3′ end of the growing strand by the polymerase, and the nucleotide analog is queried to identify the base added.

Some embodiments relate to parallel (e.g., massively parallel) sequencing.

In some embodiments, the technology described herein is related to a method for sequencing a nucleic acid comprising: hybridizing a primer to a nucleic acid to form a hybridized primer/nucleic acid complex, providing a plurality of nucleotide analogs, each nucleotide analog comprising a nucleotide and an alkyne moiety attached to the nucleotide, reacting the hybridized primer/nucleic acid complex and the nucleotide analog with a polymerase to add the nucleotide analog to the primer by a polymerase reaction to form an extended product comprising an incorporated nucleotide analog, querying the extended product to identify the incorporated nucleotide analog, and reacting the extended product with an azide-containing compound to form a structure comprising a triazole ring. In some embodiments the nucleotide analogs comprise 3′-O-propargyl-dNTP where N is selected from the group consisting of A, C, G, T and U. In some embodiments, the nucleic acid conjugate comprising a triazole ring is used in subsequent enzymatic reactions such as polymerase chain reaction. In some embodiments, the method includes providing conventional nucleotides during the same step that the nucleotide analogs are provided.

In some embodiments, the technology described herein provides a method for sequencing a nucleic acid comprising: hybridizing a primer to a nucleic acid to form a hybridized primer/nucleic acid complex, providing a plurality of nucleotides (some of which are nucleotide analogs comprising a nucleotide and an alkyne moiety attached to the nucleotide), reacting the hybridized primer/nucleic acid complex and the nucleotide analog with a polymerase to add the nucleotide analog to the primer by a polymerase reaction to form an extended product comprising an incorporated nucleotide analog, and querying the structure comprising a triazole ring to identify which analog nucleotide was incorporated. In some embodiments, the methods further comprise reacting the extended product with an azide-containing compound to form a structure comprising a triazole ring. In some embodiments the nucleotide analogs comprise 3′-O-propargyl-dNTP where N is selected from the group consisting of A, C, G, T and U. In some embodiments, the structure comprising a triazole ring is used in subsequent enzymatic reactions such as polymerase chain reaction. In some embodiments, the method includes providing conventional nucleotides during the same step that the nucleotide analogs are provided.

6. Uses

The nucleotide analogs provided herein find use in a wide range of applications. Non-limiting examples of uses for the nucleotide analogs described include use as antiviral and/or anticancer agents. In some embodiments, the nucleotide analogs provided herein find use in diagnostic medical imaging, e.g., as contrast agents for use in, e.g., MRI, computed tomography (CT) scans, X-ray imaging, angiography (e.g., venography, digital subtraction angiography (DSA), arteriography), intravenous urography, intravenous pyelography, myelography, interventional medicine (e.g., angioplasty (e.g., percutaneous transluminal angioplasty), artery ablation and/or occlusion (e.g., to treat cancer and/or vascular abnormalities), and placement of stents), arthrography, sialography, retrograde choledocho-pancreatography, micturating cystography, etc. Additional illustrative and non-limiting uses for such contrast agents include in vivo imaging for human diagnostics, drug discovery, and drug development in model systems (mouse models, etc.).

In some embodiments, an oligonucleotide comprising one or more nucleotide analogs described herein finds use in a nanoconjugate (e.g., comprising nanoparticles such as titanium dioxide nanoparticles, an oligonucleotide (e.g., comprising a nucleotide analog), and/or a contrast agent (e.g., a heavy metal contrast agent such as gadolinium)) for use in imaging and/or therapy (e.g., neutron-capture cancer therapy). See, e.g., Paunesku et al. Nanomedicine 4(3): 201-7, 2008.

In some embodiments, the technology finds use as a drug delivery tag, e.g., for the targeted cellular delivery of oligonucleotide and antisense therapeutics (e.g., siRNA, miRNA, etc.). In some embodiments, the technology finds use for the delivery of drugs linked to a nucleic acid comprising a nucleotide analog, wherein the nucleic acid serves as a targeting moiety. In some embodiments, the technology comprises use of a cell targeting moiety to direct and/or deliver an oligonucleotide to a particular cell, tissue, organ, etc. The cell targeting moiety imbues compounds (e.g., an oligonucleotide (e.g., oligonucleotide analog) according to the technology described herein linked to a cell-targeting/drug delivery moiety, e.g., as described below) with characteristics such that the compounds and/or oligonucleotides are preferentially recognized, bound, imported, processed, activated, etc. by one or more target cell types relative to one or more other non-target cell types. For example, endothelial cells have a high affinity for the peptide targeting moiety Arg-Gly-Asp (RGD), cancer and kidney cells preferentially interact with compounds having a folic acid moiety, immune cells have an affinity for mannose, and cardiomyocytes have an affinity for the peptide CWLSEAGPVVTVRALRGTGSW (see, e.g., Biomaterials 31: 8081-8087, 2010). Other targeting/delivery moieties are known in the art. Accordingly, compounds comprising a targeting moiety preferentially interact with and are taken up by the targeted cell type.

In some embodiments, the compounds comprise an RGD peptide. RGD peptides comprise 4 to 30 (e.g., 5 to 20 or 5 to 15) amino acids and target tumor cells (e.g., endothelial tumor cells). Such peptides and agents derived therefrom are known in the art, and are described by Beer et al. in Methods Mol. Biol. 680: 183-200 (2011) and in Theranostics 1: 48-57 (2011); by Morrison et al. in Theranostics 1: 149-153 (2011); by Zhou et al. in Theranostics 1: 58-82 (2011); and by Auzzas et al. in Curr. Med. Chem. 17: 1255-1299 (2010).

In some embodiments, the targeting moiety is folic acid, e.g., for targeting to cells expressing the folate receptor. The folate receptor is overexpressed on the cell surfaces of human cancer cells in, e.g., cancers of the brain, kidney, lung, ovary, and breast relative to lower levels in normal cells (see, e.g., Sudimack J, et al. 2000 “Targeted drug delivery via the folate receptor” Adv Drug Deliv Rev 41: 147-162).

In some embodiments, the targeting moiety comprises transferrin, which targets the compounds to, e.g., macrophages, erythroid precursors in bone marrow, and cancer cells. When a transferrin protein encounters a transferrin receptor on the surface of a cell, the transferrin receptor binds to the transferrin and transports the transferrin into the cell. Drugs and other compounds and/or moieties linked to the transferrin are also transported to the cell and, in some cases, imported into the cells. In some embodiments, a fragment of a transferrin targets the compounds of the technology to the target cell. See, e.g., Qian et al. (2002) “Targeted drug delivery via the transferrin receptor-mediated endocytosis pathway”, Pharmacol. Rev. 54: 561-87; Daniels et al. (2006) “The transferrin receptor part 1: Biology and targeting with cytotoxic antibodies for the treatment of cancer”, Clin. Immunol. 121: 144-58.

In some embodiments, the targeting moiety comprises the peptide VHSPNKK. This peptide targets compounds to cells expressing vascular cell adhesion molecule 1 (VCAM-1), e.g., to activated endothelial cells. Targeting activated endothelial cells finds use, e.g., in delivery of therapeutic agents to cells for treatment of inflammation and cancer. Certain melanoma cells use VCAM-1 to adhere to the endothelium and VCAM-1 participates in monocyte recruitment to atherosclerotic sites. Accordingly, the peptide VHSPNKK finds use in targeting compounds of the present technology to cancer (e.g., melanoma) cells and atherosclerotic sites.

See, e.g., Lochmann, et al. (2004) “Drug delivery of oligonucleotides by peptides” Eur. J. Pharmaceutics and Biopharmaceutics 58: 237-251, incorporated herein by reference, discussing targeting moieties and the cells targeted by those moieties.

In some embodiments, the cell-targeting moiety comprises an antibody, or derivative or fragment thereof. Antibodies to cell-specific molecules such as, e.g., proteins (e.g., cell-surface proteins, membrane proteins, proteoglycans, glycoproteins, peptides, and the like); polynucleotides (nucleic acids, nucleotides); lipids (e.g., phospholipids, glycolipids, and the like), or fragments thereof comprising an epitope or antigen specifically recognized by the antibody, target compounds according to the technology to the cells expressing the cell-specific molecules.

For example, many antibodies and antibody fragments specifically bind markers produced by or associated with tumors or infectious lesions, including viral, bacterial, fungal, and parasitic infections, and antigens and products associated with such microorganisms (see, e.g., U.S. Pat. Nos. 3,927,193; 4,331,647; 4,348,376; 4,361,544; 4,468,457; 4,444,744; 4,460,459; 4,460,561; 4,818,709; and 4,624,846, incorporated herein by reference). Moreover, antibodies that target myocardial infarctions are disclosed in, e.g., U.S. Pat. No. 4,036,945. Antibodies that target normal tissues or organs are disclosed in, e.g., U.S. Pat. No. 4,735,210. Anti-fibrin antibodies are known in the art, as are antibodies that bind to atherosclerotic plaque and to lymphocyte autoreactive clones.

For cancer (e.g., breast cancer) and its metastases, a specific marker or markers may be chosen from cell surface markers such as, for example, members of the MUC-type mucin family, an epidermal growth factor receptor (EGFR), a carcinoembryonic antigen (CEA), a human carcinoma antigen, a vascular endothelial growth factor (VEGF) antigen, a melanoma antigen gene (MAGE) family antigen, a T/Tn antigen, a hormone receptor, growth factor receptors, a cluster designation/differentiation (CD) antigen, a tumor suppressor gene, a cell cycle regulator, an oncogene, an oncogene receptor, a proliferation marker, an adhesion molecule, a proteinase involved in degradation of extracellular matrix, a malignant transformation related factor, an apoptosis related factor, glycoprotein antigens, DF3, 4F2, MGFM antigens, breast tumor antigen CA 15-3, calponin, cathepsin, CD 31 antigen, proliferating cell nuclear antigen 10 (PC 10), and pS2. For other forms of cancer and their metastases, a specific marker or markers may be selected from cell surface markers such as, for example, vascular endothelial growth factor receptor (VEGFR) family, a member of carcinoembryonic antigen (CEA) family, a type of anti-idiotypic mAb, a type of ganglioside mimic, a member of cluster designation differentiation antigens, a member of epidermal growth factor receptor (EGFR) family, a type of a cellular adhesion molecule, a member of MUC-type mucin family, a type of cancer antigen (CA), a type of a matrix metalloproteinase, a type of glycoprotein antigen, a type of melanoma associated antigen (MAA), a proteolytic enzyme, a calmodulin, a member of tumor necrosis factor (TNF) receptor family, a type of angiogenesis marker, a melanoma antigen recognized by T cells (MART) antigen, a member of melanoma antigen encoding gene (MAGE) family, a prostate-specific membrane antigen (PSMA), a small cell lung carcinoma antigen (SCLCA), a T/Tn antigen, a hormone receptor, a tumor suppressor gene antigen, a cell cycle regulator antigen, an oncogene antigen, an oncogene receptor antigen, a proliferation marker, a proteinase involved in degradation of extracellular matrix, a malignant transformation related factor, an apoptosis-related factor, and a type of human carcinoma antigen.

The antibody may have an affinity for a target associated with a disease of the immune system such as, for example, a protein, a cytokine, a chemokine, an infectious organism, and the like. In another embodiment, the antibody may be targeted to a predetermined target associated with a pathogen-borne condition. The particular target and the antibody may be specific to, but not limited to, the type of the pathogen-borne condition. A pathogen is defined as any disease-producing agent such as, for example, a bacterium, a virus, a microorganism, a fungus, a prion, or a parasite. The antibody may have an affinity for the pathogen or pathogen-associated matter. The antibody may have an affinity for a cell marker or markers associated with a pathogen-borne condition. The marker or markers may be selected such that they represent a viable target on infected cells. For a pathogen-borne condition, the antibody may be selected to target the pathogen itself. For a bacterial condition, a predetermined target may be the bacterium itself, for example, Escherichia coli or Bacillus anthracis. For a viral condition, a predetermined target may be the virus itself, for example, Cytomegalovirus (CMV), Epstein-Barr virus (EBV), a hepatitis virus, such as Hepatitis B virus, human immunodeficiency virus, such as HIV, HIV-1, or HIV-2, or a herpes virus, such as Herpes virus 6. For a parasitic condition, a predetermined target may be the parasite itself, for example, Trypanosoma cruzi, Kinetoplastid, Schistosoma mansoni, Schistosoma japonicum, or Trypanosoma brucei. For a fungal condition, a predetermined target may be the fungus itself, for example, Aspergillus, Candida, Cryptococcus neoformans, or Rhizomucor.

In another embodiment, the antibody may be targeted to a predetermined target associated with an undesirable target. The particular target and antibody may be specific to, but not limited to, the type of the undesirable target. An undesirable target is a target that may be associated with a disease or an undesirable condition, but also present in the normal condition. For example, the target may be present at elevated concentrations or otherwise be altered in the disease or undesirable state. The antibody may have an affinity for the undesirable target or for biological molecular pathways related to the undesirable target. The antibody may have an affinity for a cell marker or markers associated with the undesirable target. For an undesirable target, the choice of a predetermined target may be important to therapy utilizing the compounds according to the present technology (e.g., the drug and/or therapeutic moieties). The antibody may be selected to target biological matter associated with a disease or undesirable condition. For arteriosclerosis, a predetermined target may be, for example, apolipoprotein B on low density lipoprotein (LDL). For obesity, a predetermined marker or markers may be chosen from cell surface markers such as, for example, one of gastric inhibitory polypeptide receptor and CD36 antigen. Another undesirable predetermined target may be clotted blood. In another embodiment, the antibody may be targeted to a predetermined target associated with a reaction to an organ transplanted into the patient. The particular target and antibody may be specific to, but not limited to, the type of organ transplant. The antibody may have an affinity for a biological molecule associated with a reaction to an organ transplant. The antibody may have an affinity for a cell marker or markers associated with a reaction to an organ transplant. The marker or markers may be selected such that they represent a viable target on T cells or B cells of the immune system. 
In another embodiment, the antibody may be targeted to a predetermined target associated with a toxin in the patient. A toxin is defined as any poison produced by an organism including, but not limited to, bacterial toxins, plant toxins, insect toxin, animal toxins, and man-made toxins. The particular target and antibody may be specific to, but not limited to, the type of toxin. The antibody may have an affinity for the toxin or a biological molecule associated with a reaction to the toxin. The antibody may have an affinity for a cell marker or markers associated with a reaction to the toxin. In another embodiment, the antibody may be targeted to a predetermined target associated with a hormone-related disease. The particular target and antibody may be specific to, but not limited to, a particular hormone disease. The antibody may have an affinity for a hormone or a biological molecule associated with the hormone pathway. The antibody may have an affinity for a cell marker or markers associated with the hormone disease. In another embodiment, the antibody may be targeted to a predetermined target associated with non-cancerous diseased tissue. The particular target and antibody may be specific to, but not limited to, a particular non-cancerous diseased tissue, such as non-cancerous diseased deposits and precursor deposits. The antibody may have an affinity for a biological molecule associated with the non-cancerous diseased tissue. The antibody may have an affinity for a cell marker or markers associated with the non-cancerous diseased tissue. In another embodiment, the antibody may be targeted to a proteinaceous pathogen. The particular target and antibody may be specific to, but not limited to, a particular proteinaceous pathogen. The antibody may have an affinity for a proteinaceous pathogen or a biological molecule associated with the proteinaceous pathogen. 
The antibody may have an affinity for a cell marker or markers associated with the proteinaceous pathogen. For prion diseases, also known as transmissible spongiform encephalopathies, a predetermined target may be, for example, Prion protein 3F4.

See, e.g., U.S. Pat. Appl. Pub. No. 20050090732 (in particular Table I), incorporated herein by reference for a list of targets, cell-specific markers (e.g., antigens for targeting with an antibody moiety), antibodies, and indications associated with those targets, cell-specific markers, and antigens/antibodies.

In some embodiments, the technology finds use in imaging, such as for in situ hybridization (ISH). In some embodiments, the nucleotide analogs provided herein find use in nucleic acids that are hybridization probes for ISH and fluorescence in situ hybridization (FISH). In some embodiments, the nucleotide analogs find use in direct ISH and/or for immuno-histochemistry applications without using secondary detection reagents.

7. Pharmaceutical Formulations

In some embodiments, nucleotide analogs, oligonucleotides comprising a nucleotide analog, etc. are provided in a pharmaceutical formulation for administration to a subject. It is generally contemplated that the compounds (e.g., nucleotide analogs, oligonucleotides comprising a nucleotide analog, conjugates of nucleotide analogs and/or oligonucleotides comprising a nucleotide analog, etc.) related to the technology are formulated for administration to a mammal, and especially to a human with a condition that is responsive to the administration of such compounds. Therefore, where contemplated compounds are administered in a pharmacological composition, it is contemplated that the contemplated compounds are formulated in admixture with a pharmaceutically acceptable carrier. For example, contemplated compounds can be administered orally as pharmacologically acceptable salts, or intravenously in a physiological saline solution (e.g., buffered to a pH of about 7.2 to 7.5). Conventional buffers such as phosphates, bicarbonates, or citrates can be used for this purpose. Of course, one of ordinary skill in the art may modify the formulations within the teachings of the specification to provide numerous formulations for a particular route of administration. In particular, contemplated compounds may be modified to render them more soluble in water or other vehicle, which for example, may be easily accomplished with minor modifications (salt formulation, esterification, etc.) that are well within the ordinary skill in the art. It is also well within the ordinary skill of the art to modify the route of administration and dosage regimen of a particular compound to manage the pharmacokinetics of the present compounds for maximum beneficial effect in a patient.

In certain pharmaceutical dosage forms, prodrug forms of contemplated compounds may be formed for various purposes, including reduction of toxicity, increasing the organ or target cell specificity, etc. Among various prodrug forms, acylated (acetylated or other) derivatives, pyridine esters, and various salt forms of the present compounds are preferred. One of ordinary skill in the art will recognize how to modify the present compounds to prodrug forms to facilitate delivery of active compounds to a target site within the host organism or patient. One of ordinary skill in the art will also take advantage of favorable pharmacokinetic parameters of the prodrug forms, where applicable, in delivering the present compounds to a targeted site within the host organism or patient to maximize the intended effect of the compound. Similarly, it should be appreciated that contemplated compounds may also be metabolized to their biologically active form, and all metabolites of the compounds herein are therefore specifically contemplated. In addition, contemplated compounds (and combinations thereof) may be administered in combination with yet further agents.

With respect to administration to a subject, it is contemplated that the compounds be administered in a pharmaceutically effective amount. One of ordinary skill recognizes that a pharmaceutically effective amount varies depending on the therapeutic agent used, the subject's age, condition, and sex, and on the extent of the disease in the subject. Generally, the dosage should not be so large as to cause adverse side effects, such as hyperviscosity syndromes, pulmonary edema, congestive heart failure, and the like. The dosage can also be adjusted by the individual physician or veterinarian to achieve the desired therapeutic goal.

As used herein, the actual amount encompassed by the term “pharmaceutically effective amount” will depend on the route of administration, the type of subject being treated, and the physical characteristics of the specific subject under consideration. These factors and their relationship to determining this amount are well known to skilled practitioners in the medical, veterinary, and other related arts. This amount and the method of administration can be tailored to maximize efficacy but will depend on such factors as weight, diet, concurrent medication, and other factors that those skilled in the art will recognize.

Pharmaceutical compositions preferably comprise one or more compounds of the present technology associated with one or more pharmaceutically acceptable carriers, diluents, or excipients. Pharmaceutically acceptable carriers are known in the art such as those described in, for example, Remington's Pharmaceutical Sciences, Mack Publishing Co. (A. R. Gennaro edit. 1985), explicitly incorporated herein by reference for all purposes.

Accordingly, in some embodiments, the compound is formulated as a tablet; a capsule; a time release tablet; a time release capsule; a time release pellet; a slow release tablet; a slow release capsule; a slow release pellet; a fast release tablet; a fast release capsule; a fast release pellet; a sublingual tablet; a gel capsule; a microencapsulation; a transdermal delivery formulation; a transdermal gel; a transdermal patch; a sterile solution; a sterile solution prepared for use as an intramuscular or subcutaneous injection, for use as a direct injection into a targeted site, or for intravenous administration; a solution prepared for rectal administration; a solution prepared for administration through a gastric feeding tube or duodenal feeding tube; a suppository for rectal administration; a liquid for oral consumption prepared as a solution or an elixir; a topical cream; a gel; a lotion; a tincture; a syrup; an emulsion; or a suspension.

In some embodiments, the time release formulation is a sustained-release, sustained-action, extended-release, controlled-release, modified release, or continuous-release mechanism, e.g., the composition is formulated to dissolve quickly, slowly, or at any appropriate rate of release of the compound over time.

In some embodiments, the compositions are formulated so that the active ingredient is embedded in a matrix of an insoluble substance (e.g., various acrylics, chitin) such that the dissolving compound finds its way out through the holes in the matrix, e.g., by diffusion. In some embodiments, the formulation is enclosed in a polymer-based tablet with a laser-drilled hole on one side and a porous membrane on the other side. Stomach acids push through the porous membrane, thereby pushing the drug out through the laser-drilled hole. In time, the entire drug dose releases into the system while the polymer container remains intact, to be excreted later through normal digestion. In some sustained-release formulations, the compound dissolves into the matrix and the matrix physically swells to form a gel, allowing the compound to exit through the gel's outer surface. In some embodiments, the formulations are in a micro-encapsulated form, e.g., which is used in some embodiments to produce a complex dissolution profile. For example, by coating the compound around an inert core and layering it with insoluble substances to form a microsphere, some embodiments provide more consistent and replicable dissolution rates in a convenient format that is combined in particular embodiments with other controlled (e.g., instant) release pharmaceutical ingredients, e.g., to provide a multipart gel capsule.

In some embodiments, the pharmaceutical preparations and/or formulations of the technology are provided in particles. “Particles” as used herein in a pharmaceutical context means nano- or microparticles (or in some instances larger) that can consist in whole or in part of the compounds as described herein. The particles may contain the preparations and/or formulations in a core surrounded by a coating, including, but not limited to, an enteric coating. The preparations and/or formulations also may be dispersed throughout the particles. The preparations and/or formulations also may be adsorbed into the particles. The particles may be of any order release kinetics, including zero order release, first order release, second order release, delayed release, sustained release, immediate release, and any combination thereof, etc. The particle may include, in addition to the preparations and/or formulations, any of those materials routinely used in the art of pharmacy and medicine, including, but not limited to, erodible, nonerodible, biodegradable, or nonbiodegradable materials or combinations thereof. The particles may be microcapsules which contain the formulation in a solution or in a semi-solid state. The particles may be of virtually any shape.

Both non-biodegradable and biodegradable polymeric materials can be used in the manufacture of particles for delivering the preparations and/or formulations. Such polymers may be natural or synthetic polymers. The polymer is selected based on the period of time over which release is desired. Bioadhesive polymers of particular interest include bioerodible hydrogels described by H. S. Sawhney, C. P. Pathak and J. A. Hubell in Macromolecules, (1993) 26: 581-587, the teachings of which are incorporated herein by reference. These include polyhyaluronic acids, casein, gelatin, glutin, polyanhydrides, polyacrylic acid, alginate, chitosan, poly(methyl methacrylates), poly(ethyl methacrylates), poly(butylmethacrylate), poly(isobutyl methacrylate), poly(hexylmethacrylate), poly(isodecyl methacrylate), poly(lauryl methacrylate), poly(phenylmethacrylate), poly(methyl acrylate), poly(isopropyl acrylate), poly(isobutyl acrylate), and poly(octadecyl acrylate).

The technology also provides methods for preparing stable pharmaceutical preparations containing aqueous solutions of the compounds or salts thereof to inhibit formation of degradation products. A solution is provided that contains the compound or salts thereof and at least one inhibiting agent. The solution is processed under at least one sterilization technique prior to and/or after terminally filling the solution into a sealable container to form a stable pharmaceutical preparation. The present formulations may be prepared by various methods known in the art so long as the formulation is substantially homogeneous, e.g., the pharmaceutical is distributed substantially uniformly within the formulation. Such uniform distribution facilitates control over drug release from the formulation.

In some embodiments, the compound is formulated with a buffering agent. The buffering agent may be any pharmaceutically acceptable buffering agent. Buffer systems include citrate buffers, acetate buffers, borate buffers, and phosphate buffers. Examples of buffers include citric acid, sodium citrate, sodium acetate, acetic acid, sodium phosphate and phosphoric acid, sodium ascorbate, tartaric acid, maleic acid, glycine, sodium lactate, lactic acid, ascorbic acid, imidazole, sodium bicarbonate and carbonic acid, sodium succinate and succinic acid, histidine, and sodium benzoate and benzoic acid. In some embodiments, the compound is formulated with a chelating agent. The chelating agent may be any pharmaceutically acceptable chelating agent. Chelating agents include ethylenediaminetetraacetic acid (also synonymous with EDTA, edetic acid, versene acid, and sequestrene), and EDTA derivatives, such as dipotassium edetate, disodium edetate, edetate calcium disodium, sodium edetate, trisodium edetate, and potassium edetate. Other chelating agents include citric acid and derivatives thereof. Citric acid also is known as citric acid monohydrate. Derivatives of citric acid include anhydrous citric acid and trisodium citrate dihydrate. Still other chelating agents include niacinamide and derivatives thereof and sodium desoxycholate and derivatives thereof.

In some embodiments, the compound is formulated with an antioxidant. The antioxidant may be any pharmaceutically acceptable antioxidant. Antioxidants are well known to those of ordinary skill in the art and include materials such as ascorbic acid, ascorbic acid derivatives (e.g., ascorbyl palmitate, ascorbyl stearate, sodium ascorbate, calcium ascorbate, etc.), butylated hydroxyanisole, butylated hydroxytoluene, alkyl gallate, sodium meta-bisulfate, sodium bisulfate, sodium dithionite, sodium thioglycollic acid, sodium formaldehyde sulfoxylate, tocopherol and derivatives thereof (d-alpha tocopherol, d-alpha tocopherol acetate, dl-alpha tocopherol acetate, d-alpha tocopherol succinate, beta tocopherol, delta tocopherol, gamma tocopherol, and d-alpha tocopherol polyoxyethylene glycol 1000 succinate), monothioglycerol, and sodium sulfite. Such materials are typically added in ranges from 0.01 to 2.0%.

In some embodiments, the compound is formulated with a cryoprotectant. The cryoprotecting agent may be any pharmaceutically acceptable cryoprotecting agent. Common cryoprotecting agents include histidine, polyethylene glycol, polyvinylpyrrolidone, lactose, sucrose, mannitol, and polyols.

In some embodiments, the compound is formulated with an isotonicity agent. The isotonicity agent can be any pharmaceutically acceptable isotonicity agent. This term is used in the art interchangeably with iso-osmotic agent, and is known as a compound which is added to the pharmaceutical preparation to increase the osmotic pressure, e.g., in some embodiments to that of 0.9% sodium chloride solution, which is iso-osmotic with human extracellular fluids, such as plasma. Preferred isotonicity agents are sodium chloride, mannitol, sorbitol, lactose, dextrose and glycerol.
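
The 0.9% sodium chloride reference point can be related to osmolarity by simple arithmetic. The following Python sketch assumes ideal behavior (complete dissociation into two particles per formula unit, with the osmotic coefficient ignored) and a molar mass of 58.44 g/mol for sodium chloride. The function name and its parameters are hypothetical, and the result is an idealized estimate rather than a measured value.

```python
NACL_MOLAR_MASS_G_PER_MOL = 58.44  # molar mass of sodium chloride (assumed value)


def ideal_osmolarity_mosm_per_l(percent_w_v, molar_mass_g_per_mol, particles_per_formula_unit):
    """Idealized osmolarity (mOsm/L) of a solute given as percent weight/volume.

    Assumes complete dissociation and ignores the osmotic coefficient.
    """
    grams_per_litre = percent_w_v * 10.0            # 1% w/v = 1 g per 100 mL = 10 g/L
    molarity = grams_per_litre / molar_mass_g_per_mol
    return molarity * particles_per_formula_unit * 1000.0


# 0.9% w/v NaCl, two particles (Na+ and Cl-) per formula unit
saline_mosm = ideal_osmolarity_mosm_per_l(0.9, NACL_MOLAR_MASS_G_PER_MOL, 2)
```

Under these assumptions, 0.9% w/v sodium chloride corresponds to 9 g/L, about 0.154 mol/L, and therefore about 308 mOsm/L.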

The pharmaceutical preparation may optionally comprise a preservative. Common preservatives include those selected from the group consisting of chlorobutanol, parabens, thimerosal, benzyl alcohol, and phenol. Suitable preservatives include but are not limited to: chlorobutanol (0.3-0.9% w/v), parabens (0.01-5.0%), thimerosal (0.004-0.2%), benzyl alcohol (0.5-5%), phenol (0.1-1.0%), and the like.
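
For illustration, the concentration ranges listed above can be encoded as a lookup table and used to check a proposed preservative level. The following Python sketch is hypothetical: the dictionary and function names are introduced here for illustration, and all percentages are treated as being on a single common basis, although only the chlorobutanol range above is expressly stated as w/v.

```python
# Ranges (percent) transcribed from the paragraph above; names are illustrative.
PRESERVATIVE_RANGES_PERCENT = {
    "chlorobutanol": (0.3, 0.9),
    "parabens": (0.01, 5.0),
    "thimerosal": (0.004, 0.2),
    "benzyl alcohol": (0.5, 5.0),
    "phenol": (0.1, 1.0),
}


def within_listed_range(preservative, percent):
    """Return True if percent lies inside the range listed for the preservative."""
    low, high = PRESERVATIVE_RANGES_PERCENT[preservative]
    return low <= percent <= high
```

For example, within_listed_range("phenol", 0.5) evaluates to True, whereas within_listed_range("chlorobutanol", 1.2) evaluates to False because 1.2% exceeds the 0.9% upper bound.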

In some embodiments, the compound is formulated with a humectant to provide a pleasant mouth-feel in oral applications. Humectants known in the art include cholesterol, fatty acids, glycerin, lauric acid, magnesium stearate, pentaerythritol, and propylene glycol.

In some embodiments, an emulsifying agent is included in the formulations, for example, to ensure complete dissolution of all excipients, especially hydrophobic components such as benzyl alcohol. Many emulsifiers are known in the art, e.g., polysorbate 60.

For some embodiments related to oral administration, it may be desirable to add a pharmaceutically acceptable flavoring agent and/or sweetener. Compounds such as saccharin, glycerin, simple syrup, and sorbitol are useful as sweeteners.

8. Administration, Treatments, and Dosing

In some embodiments, the technology relates to methods of providing a dosage of a nucleotide analog, oligonucleotide comprising a nucleotide analog, or a conjugate thereof (e.g., comprising a targeting moiety, contrast agent, label, tag, etc.) to a subject. In some embodiments, a compound, a derivative thereof, or a pharmaceutically acceptable salt thereof, is administered in a pharmaceutically effective amount. In some embodiments, a compound, a derivative thereof, or a pharmaceutically acceptable salt thereof, is administered in a therapeutically effective dose.

The dosage amount and frequency are selected to create an effective level of the compound without substantially harmful effects. When administered orally or intravenously, the dosage of the compound or related compounds will generally range from 0.001 to 10,000 mg/kg/day or dose (e.g., 0.01 to 1000 mg/kg/day or dose; 0.1 to 100 mg/kg/day or dose).
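As a non-limiting arithmetic illustration, a per-kilogram dosage can be converted to a total amount and compared against the ranges recited above. The following Python sketch uses hypothetical function names and a hypothetical 70 kg body mass; only the three numeric ranges are taken from the text.

```python
from typing import Optional, Tuple

# Dosage ranges recited in the text, in mg/kg/day (or per dose), broadest first.
RECITED_RANGES_MG_PER_KG = [(0.001, 10_000.0), (0.01, 1_000.0), (0.1, 100.0)]


def total_dose_mg(dose_mg_per_kg: float, body_mass_kg: float) -> float:
    """Convert a per-kilogram dose to a total amount in milligrams."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return dose_mg_per_kg * body_mass_kg


def narrowest_recited_range(dose_mg_per_kg: float) -> Optional[Tuple[float, float]]:
    """Return the narrowest recited range containing the dose, or None if outside all."""
    match = None
    for low, high in RECITED_RANGES_MG_PER_KG:
        # The ranges are nested and ordered broadest first, so the last hit is the narrowest.
        if low <= dose_mg_per_kg <= high:
            match = (low, high)
    return match


# Hypothetical example values, chosen only to show the arithmetic.
print(total_dose_mg(5.0, 70.0))          # 350.0 mg for a 5 mg/kg dose and a 70 kg subject
print(narrowest_recited_range(5.0))      # (0.1, 100.0)
print(narrowest_recited_range(0.05))     # (0.01, 1000.0)
print(narrowest_recited_range(20000.0))  # None: outside every recited range
```

The sketch only demonstrates unit handling and range membership; it does not select a dosage.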

Methods of administering a pharmaceutically effective amount include, without limitation, administration in parenteral, oral, intraperitoneal, intranasal, topical, sublingual, rectal, and vaginal forms. Parenteral routes of administration include, for example, subcutaneous, intravenous, intramuscular, intrasternal injection, and infusion routes. In some embodiments, the compound, a derivative thereof, or a pharmaceutically acceptable salt thereof, is administered orally.

In some embodiments, a single dose of a compound or a related compound is administered to a subject. In other embodiments, multiple doses are administered over two or more time points, separated by hours, days, weeks, etc. In some embodiments, compounds are administered over a long period of time (e.g., chronically), for example, for a period of months or years (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or more months or years). In such embodiments, compounds may be taken on a regular scheduled basis (e.g., daily, weekly, etc.) for the duration of the extended period.

The technology also relates to methods of treating a subject with a drug appropriate for the subject's malady. According to another aspect of the technology, a method is provided for treating a subject in need of such treatment with an effective amount of a compound or a salt thereof. The method involves administering to the subject an effective amount of a compound or a salt thereof in any one of the pharmaceutical preparations described above, detailed herein, and/or set forth in the claims. The subject can be any subject in need of such treatment. In the foregoing description, the technology is described in connection with a compound or salts thereof. Such salts include, but are not limited to, bromide salts, chloride salts, iodide salts, carbonate salts, and sulfate salts. It should be understood, however, that the compound is a member of a class of compounds and the technology is intended to embrace pharmaceutical preparations, methods, and kits containing related derivatives within this class. Another aspect of the technology then embraces the foregoing summary, read in each aspect as if any such derivative were substituted wherever “compound” appears.

In some embodiments, a subject is tested to assess the presence, the absence, or the level of a malady and/or a condition. Such testing is performed, e.g., by assaying or measuring a biomarker, a metabolite, a physical symptom, an indication, etc., to determine the risk of or the presence of the malady or condition. In some embodiments, the subject is treated with a compound based on the outcome of the test. In some embodiments, a subject is treated, a sample is obtained and the level of detectable agent is measured, and then the subject is treated again based on the level of detectable agent that was measured. In some embodiments, a subject is treated, a sample is obtained and the level of detectable agent is measured, the subject is treated again based on the level of detectable agent that was measured, and then another sample is obtained and the level of detectable agent is measured. In some embodiments, other tests (e.g., not based on measuring the level of detectable agent) are also used at various stages, e.g., before the initial treatment as a guide for the initial dose. In some embodiments, a subsequent treatment is adjusted based on a test result, e.g., the dosage amount, dosage schedule, identity of the drug, etc. is changed. In some embodiments, a patient is tested, treated, and then tested again to monitor the response to therapy and/or change the therapy. In some embodiments, cycles of testing and treatment may occur without limitation to the pattern of testing and treating, the periodicity, or the duration of the interval between each testing and treatment phase. As such, the technology contemplates various combinations of testing and treating without limitation, e.g., test/treat, treat/test, test/treat/test, treat/test/treat, test/treat/test/treat, test/treat/test/treat/test, test/treat/test/test/treat/treat/treat/test, treat/treat/test/treat, test/treat/treat/test/treat/treat, etc.
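The test/treat combinations listed above can be represented, purely as a sketch, by a slash-delimited pattern that is stepped through in order. In the following Python example, the function names, the numeric readings, the threshold of 5.0, and the halving rule are all hypothetical placeholders that do not appear in the disclosure; they stand in for whatever assay and adjustment an embodiment uses.

```python
from typing import Callable, List, Optional, Tuple


def run_schedule(
    pattern: str,
    measure: Callable[[], float],
    initial_dose: float,
    adjust: Callable[[float, float], float],
) -> List[Tuple[str, float]]:
    """Step through a pattern such as 'test/treat/test' and log each phase.

    A 'test' step records a measured level; a 'treat' step applies the current
    dose, first adjusting it if an unused test result is available.
    """
    dose = initial_dose
    pending_level: Optional[float] = None
    log: List[Tuple[str, float]] = []
    for step in pattern.split("/"):
        if step == "test":
            pending_level = measure()
            log.append(("test", pending_level))
        elif step == "treat":
            if pending_level is not None:
                dose = adjust(dose, pending_level)
                pending_level = None  # each test result informs only one adjustment
            log.append(("treat", dose))
        else:
            raise ValueError(f"unknown step: {step!r}")
    return log


# Hypothetical inputs: two assay readings and an arbitrary adjustment rule.
readings = iter([8.0, 3.0])
halve_if_low = lambda dose, level: dose / 2 if level < 5.0 else dose

print(run_schedule("test/treat/test/treat", lambda: next(readings), 10.0, halve_if_low))
# [('test', 8.0), ('treat', 10.0), ('test', 3.0), ('treat', 5.0)]
```

Because the pattern is an arbitrary string of steps, any of the listed combinations (e.g., treat/treat/test/treat) can be supplied without changing the function.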

Although the disclosure herein refers to certain illustrated embodiments, it is to be understood that these embodiments are presented by way of example and not by way of limitation. 

1-49. (canceled)
 50. A composition comprising a nucleotide analog having a structure according to:

wherein B is a base and P comprises a phosphate moiety.
 51. The composition of claim 50 wherein P comprises a tetraphosphate, a triphosphate, a diphosphate, a monophosphate, a modified tetraphosphate, a modified triphosphate, a modified diphosphate, or a modified monophosphate.
 52. The composition of claim 50 wherein B is selected from the group consisting of cytosine, guanine, adenine, thymine, and uracil.
 53. The composition of claim 50 wherein B comprises a purine, a pyrimidine, a modified purine, or a modified pyrimidine.
 54. The composition of claim 50 wherein the nucleotide analog comprises a thio-alkynyl, thio-propargyl, 3′-S-propargyl, thiofuranose, thioribose, thiodeoxyribose, arabinose, or a modified sugar.
 55. The composition of claim 50 wherein P comprises a 5′ hydroxyl, an alpha thiophosphate, a beta thiophosphate, a gamma thiophosphate, an alpha methylphosphonate, a beta methylphosphonate, or a gamma methylphosphonate.
 56. The composition of claim 50 further comprising a polymerase, a nucleic acid, or a nucleotide.
 57. The composition of claim 50 wherein the nucleotide analog is modified with a sulfur.
 58. A method for synthesizing a modified nucleic acid, the method comprising: a) providing a nucleotide analog comprising an alkyne group; and b) linking a nucleic acid to the nucleotide analog to produce a modified nucleic acid comprising the nucleotide analog.
 59. The method of claim 58 wherein the nucleotide analog has a structure according to:

wherein B is a base and P comprises a triphosphate moiety.
 60. The method of claim 58 further comprising providing a template, a primer, a nucleotide, and/or a polymerase.
 61. The method of claim 58 wherein a polymerase catalyzes the linking.
 62. The method of claim 58 further comprising reacting the modified nucleic acid with an azide moiety.
 63. The method of claim 58 further comprising reacting the modified nucleic acid with an azide moiety and a copper-based catalyst reagent.
 64. The method of claim 58 comprising terminating polymerization with the nucleotide analog.
 65. The method of claim 58 further comprising reacting the modified nucleic acid with an adaptor oligonucleotide comprising an azide moiety, an adaptor oligonucleotide comprising a barcode and comprising an azide moiety, or a barcode oligonucleotide comprising an azide moiety to produce a nucleic acid-oligonucleotide conjugate.
 66. A kit for synthesizing a modified nucleic acid, the kit comprising: a) a nucleotide analog comprising an alkynyl group; and b) a copper-based catalyst reagent.
 67. The kit of claim 66 further comprising a polymerase.
 68. The kit of claim 66 further comprising an adaptor oligonucleotide comprising an azide moiety.
 69. The kit of claim 66 further comprising a nucleotide. 